Update trigger#4
Open
tylerc-govsignals wants to merge 1041 commits into
…ges (#3559)

## Summary

Make `taskIdentifier` optional on the run-queue message schema. No behavior change in this PR; readers continue to accept payloads that include the field. A separate change will stop writing it on the wire to shrink the per-run payload that lives in Redis while runs wait to be dequeued.

## Design

The field is written into every payload at enqueue time but no consumer reads it back on the dequeue path. Both the run-engine and supervisor derive `taskIdentifier` from the loaded `TaskRun` row instead.

Relaxing the schema first means readers tolerate payloads that omit it, so the writer-side change can ship without producing schema-parse errors during a rolling deploy.

`projectId` is left required: `WorkerQueueResolver.#getOverride` reads it for project-scoped runtime worker-queue overrides.

## Test plan

- [x] `pnpm run typecheck --filter @internal/run-engine`
- [x] `pnpm run typecheck --filter webapp`
- [x] `pnpm run test ./src/run-queue/tests/enqueueMessage.test.ts ./src/run-queue/tests/workerQueueResolver.test.ts --run` (28/28 passing)
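The rolling-deploy reasoning above can be illustrated with a tolerant reader. This is a minimal sketch and not the actual run-queue schema: `parseRunQueueMessage` and the `runId` field are hypothetical names chosen for illustration, and only `taskIdentifier` (optional) and `projectId` (required) come from the description.

```typescript
// Hypothetical message shape: only `projectId` and `taskIdentifier` are named
// in the PR description; `runId` is an illustrative stand-in.
type RunQueueMessage = {
  runId: string;
  projectId: string;
  taskIdentifier?: string;
};

// Accepts payloads with or without `taskIdentifier`, so old and new writers
// can coexist during a rolling deploy.
function parseRunQueueMessage(raw: unknown): RunQueueMessage {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("payload must be an object");
  }
  const { runId, projectId, taskIdentifier } = raw as Record<string, unknown>;
  if (typeof runId !== "string") throw new Error("runId is required");
  if (typeof projectId !== "string") throw new Error("projectId is required");
  if (taskIdentifier === undefined) {
    return { runId, projectId };
  }
  if (typeof taskIdentifier !== "string") {
    throw new Error("taskIdentifier must be a string when present");
  }
  return { runId, projectId, taskIdentifier };
}
```

A payload written by an old instance (field present) and one written by a new instance (field omitted) both parse, while a payload missing `projectId` is still rejected.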
### Style updates to the notifications

- Tightened up the typography
- Brighter background to make it stand out a bit more
- A bit more padding to make it more readable
- Show the close button on hover instead
- Turned the notification into a separate component as it's shared on the admin page modal
- Minor tweaks to the behavior of toggling the notification between open/closed side menu states

### Before

<img width="224" height="313" alt="before" src="https://github.com/user-attachments/assets/c9a9377c-4a3b-4477-921a-3c86385d3f0b" />

### After (with image)

<img width="239" height="284" alt="CleanShot 2026-05-11 at 17 22 01" src="https://github.com/user-attachments/assets/311b4dbc-4853-4e6c-9f83-8173b38bd466" />

### After (no image)

<img width="239" height="189" alt="after" src="https://github.com/user-attachments/assets/884e062b-3608-4cb3-a462-d50597257753" />

---------

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
## Summary
1 improvement, 1 bug fix.
## Improvements
- Fail attempts on uncaught exceptions instead of hanging to
`MAX_DURATION_EXCEEDED`. A Node `EventEmitter` (e.g. `node-redis`)
emitting `"error"` with no `.on("error", ...)` listener escalates to
`uncaughtException`, which the worker previously reported but did not
act on — runs drifted to maxDuration with empty attempts. They now fail
fast with the original error and status `FAILED`, and respect the task's
normal retry policy. You should still attach `.on("error", ...)`
listeners to long-lived clients to handle errors gracefully.
([#3529](#3529))
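The escalation described in this entry follows from documented Node behaviour: emitting `"error"` on an `EventEmitter` with no `"error"` listener throws the error from `emit`. The sketch below is an illustration only; the two plain emitters stand in for a long-lived client such as a Redis connection, and `emitError` is a hypothetical helper.

```typescript
import { EventEmitter } from "node:events";

// Emits "error" on the given emitter and reports whether the emit call threw.
function emitError(emitter: EventEmitter, err: Error): "handled" | "thrown" {
  try {
    emitter.emit("error", err);
    return "handled";
  } catch {
    return "thrown";
  }
}

// No "error" listener: emit throws the error itself.
const unguarded = new EventEmitter();

// With a listener: the error is delivered to the handler instead.
const guarded = new EventEmitter();
const seen: string[] = [];
guarded.on("error", (err: Error) => {
  seen.push(err.message);
});

const unguardedResult = emitError(unguarded, new Error("connection reset"));
const guardedResult = emitError(guarded, new Error("connection reset"));
```

In a real client the emit usually happens inside an I/O callback with no surrounding `try`, which is why the throw surfaces as `uncaughtException` rather than being caught as it is here.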
## Bug fixes
- Fix dev workers spinning at 100% CPU after the parent CLI disconnects.
Orphaned `trigger-dev-run-worker` (and indexer) processes were caught in
an `uncaughtException` feedback loop: a periodic IPC send via
`process.send` would throw `ERR_IPC_CHANNEL_CLOSED` once the parent
closed the channel, which re-entered the same handler that itself called
`process.send`, scheduled via `setImmediate` and amplified by
source-map-support's `prepareStackTrace`. Fixed by (1) silently dropping
packets in `ZodIpcConnection` when the channel is disconnected, (2)
adding a `process.on("disconnect", ...)` handler in dev workers so they
exit cleanly when the CLI closes the IPC channel, and (3) wrapping all
`uncaughtException`-path `process.send` calls in a `safeSend` guard that
checks `process.connected` and swallows synchronous throws.
([#3491](#3491))
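A guard of the kind described in step (3) might look like the sketch below. It is not the actual `safeSend` from the codebase: the `IpcLike` type and the boolean return value are assumptions made so the example is self-contained and testable.

```typescript
// Minimal structural type (an assumption) so the sketch does not depend on
// Node's typings; Node's `process` has the same two members.
type IpcLike = {
  connected?: boolean;
  send?: (message: unknown) => unknown;
};

// Returns true if the message was handed to the channel, false if it was dropped.
function safeSend(proc: IpcLike, message: unknown): boolean {
  if (!proc.connected || typeof proc.send !== "function") {
    return false; // channel closed or never existed: drop the packet
  }
  try {
    proc.send(message);
    return true;
  } catch {
    // A synchronous throw (for example a closed channel racing with the
    // check above) is swallowed so it cannot re-enter the exception handler.
    return false;
  }
}
```

Because the guard never throws, calling it from an `uncaughtException` handler cannot feed the handler another exception, which is the loop the fix is meant to break.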
<details>
<summary>Raw changeset output</summary>
# Releases
## @trigger.dev/build@4.4.6
### Patch Changes
- Updated dependencies:
- `@trigger.dev/core@4.4.6`
## trigger.dev@4.4.6
### Patch Changes
- Fix dev workers spinning at 100% CPU after the parent CLI disconnects.
Orphaned `trigger-dev-run-worker` (and indexer) processes were caught in
an `uncaughtException` feedback loop: a periodic IPC send via
`process.send` would throw `ERR_IPC_CHANNEL_CLOSED` once the parent
closed the channel, which re-entered the same handler that itself called
`process.send`, scheduled via `setImmediate` and amplified by
source-map-support's `prepareStackTrace`. Fixed by (1) silently dropping
packets in `ZodIpcConnection` when the channel is disconnected, (2)
adding a `process.on("disconnect", ...)` handler in dev workers so they
exit cleanly when the CLI closes the IPC channel, and (3) wrapping all
`uncaughtException`-path `process.send` calls in a `safeSend` guard that
checks `process.connected` and swallows synchronous throws.
([#3491](#3491))
- Fail attempts on uncaught exceptions instead of hanging to
`MAX_DURATION_EXCEEDED`. A Node `EventEmitter` (e.g. `node-redis`)
emitting `"error"` with no `.on("error", ...)` listener escalates to
`uncaughtException`, which the worker previously reported but did not
act on — runs drifted to maxDuration with empty attempts. They now fail
fast with the original error and status `FAILED`, and respect the task's
normal retry policy. You should still attach `.on("error", ...)`
listeners to long-lived clients to handle errors gracefully.
([#3529](#3529))
- Updated dependencies:
- `@trigger.dev/core@4.4.6`
- `@trigger.dev/build@4.4.6`
- `@trigger.dev/schema-to-json@4.4.6`
## @trigger.dev/core@4.4.6
### Patch Changes
- Fix dev workers spinning at 100% CPU after the parent CLI disconnects.
Orphaned `trigger-dev-run-worker` (and indexer) processes were caught in
an `uncaughtException` feedback loop: a periodic IPC send via
`process.send` would throw `ERR_IPC_CHANNEL_CLOSED` once the parent
closed the channel, which re-entered the same handler that itself called
`process.send`, scheduled via `setImmediate` and amplified by
source-map-support's `prepareStackTrace`. Fixed by (1) silently dropping
packets in `ZodIpcConnection` when the channel is disconnected, (2)
adding a `process.on("disconnect", ...)` handler in dev workers so they
exit cleanly when the CLI closes the IPC channel, and (3) wrapping all
`uncaughtException`-path `process.send` calls in a `safeSend` guard that
checks `process.connected` and swallows synchronous throws.
([#3491](#3491))
- Fail attempts on uncaught exceptions instead of hanging to
`MAX_DURATION_EXCEEDED`. A Node `EventEmitter` (e.g. `node-redis`)
emitting `"error"` with no `.on("error", ...)` listener escalates to
`uncaughtException`, which the worker previously reported but did not
act on — runs drifted to maxDuration with empty attempts. They now fail
fast with the original error and status `FAILED`, and respect the task's
normal retry policy. You should still attach `.on("error", ...)`
listeners to long-lived clients to handle errors gracefully.
([#3529](#3529))
## @trigger.dev/python@4.4.6
### Patch Changes
- Updated dependencies:
- `@trigger.dev/core@4.4.6`
- `@trigger.dev/build@4.4.6`
- `@trigger.dev/sdk@4.4.6`
## @trigger.dev/react-hooks@4.4.6
### Patch Changes
- Updated dependencies:
- `@trigger.dev/core@4.4.6`
## @trigger.dev/redis-worker@4.4.6
### Patch Changes
- Updated dependencies:
- `@trigger.dev/core@4.4.6`
## @trigger.dev/rsc@4.4.6
### Patch Changes
- Updated dependencies:
- `@trigger.dev/core@4.4.6`
## @trigger.dev/schema-to-json@4.4.6
### Patch Changes
- Updated dependencies:
- `@trigger.dev/core@4.4.6`
## @trigger.dev/sdk@4.4.6
### Patch Changes
- Updated dependencies:
- `@trigger.dev/core@4.4.6`
</details>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
…3552)

Closes [TRI-9234](https://linear.app/triggerdotdev/issue/TRI-9234/retry-task-process-sigsegv-errors-respecting-user-retry-config)

## What this changes

SIGSEGV crashes (`TASK_PROCESS_SIGSEGV`) will now be **retried when an attempt fails**, in line with the task's configured retry settings (`retry.maxAttempts` etc.), the same path that SIGTERM and uncaught exceptions already use.

Previously SIGSEGV was hard-classified as non-retriable and failed the run on the first segfault, ignoring the user's retry policy. Tasks without a retry policy still fail fast on the first SIGSEGV.

Behaviour is unchanged for OOM kills (separate machine-bump retry path) and SIGKILL_TIMEOUT.

## Deploy

**Only the webapp needs to ship.** The retry decision lives entirely in the webapp:

- V2 path: `internal-packages/run-engine` (bundled into the webapp)
- V1 path: `apps/webapp/app/v3/services/completeAttempt.server.ts`

No supervisor, CLI, SDK, or customer-task-image changes required. Customers do not need to redeploy. The `@trigger.dev/core` changeset is just keeping the public package in sync; the published npm version isn't what makes the fix work.

## Why retry

SIGSEGV in Node tasks is frequently non-deterministic across processes:

- **Native addon races** (`sharp`, `canvas`, `better-sqlite3`, `node-rdkafka`, `bcrypt`, …): libuv thread-pool work stepping on V8 handles. Different heap layout / thread schedule on a fresh process → retry often succeeds.
- **JIT / GC interaction**: V8 turbofan deopt or GC during a native callback. Timing-dependent.
- **Near-OOM in native code**: when RSS approaches the cgroup limit, native allocations fail and poorly-written addons dereference NULL → SIGSEGV instead of clean OOM-kill.
- **Host / hardware issues**: bit flips, kernel quirks. Retry lands on a different host.

The genuinely deterministic case (a user-code bug always tripping the same addon) is real but only a subset of crashes, and `maxAttempts` bounds the damage.
## Pre-existing inconsistency this resolves

- `shouldRetryError` returned `false` for `TASK_PROCESS_SIGSEGV` → `fail_run`.
- `shouldLookupRetrySettings` already listed `TASK_PROCESS_SIGSEGV` as retry-config-aware, but that branch was unreachable because `shouldRetryError` short-circuited first in `retrying.ts:86-90`.
- We already retry `TASK_RUN_UNCAUGHT_EXCEPTION` (clearly a user-code bug) under the user's retry policy; refusing to retry SIGSEGV was the odd one out.

## Test plan

- [x] `pnpm exec vitest run test/errors.test.ts` in `packages/core`: 26/26 pass (4 new)
- [x] `pnpm run build --filter @trigger.dev/core`
- [ ] CI green on PR

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
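The behaviour described in this commit can be modelled roughly as follows. This is a simplified, hypothetical sketch: `decideAfterFailedAttempt`, `RetryPolicy`, and the `"retry"` decision label are illustrative and do not mirror the real `shouldRetryError` / `shouldLookupRetrySettings` signatures. Treating every other error code as `fail_run` is also a simplification.

```typescript
// Hypothetical types for illustration only.
type RetryPolicy = { maxAttempts: number };
type Decision = "retry" | "fail_run";

// Error codes that defer to the task's retry policy, per the description.
const RETRY_CONFIG_AWARE = new Set<string>([
  "TASK_PROCESS_SIGSEGV",
  "TASK_RUN_UNCAUGHT_EXCEPTION",
]);

function decideAfterFailedAttempt(
  errorCode: string,
  attemptNumber: number,
  retry?: RetryPolicy
): Decision {
  // Simplification: anything not in the set fails the run immediately.
  if (!RETRY_CONFIG_AWARE.has(errorCode)) return "fail_run";
  // No retry policy configured: fail fast on the first crash.
  if (!retry) return "fail_run";
  // `maxAttempts` bounds the damage for deterministic crashes.
  return attemptNumber < retry.maxAttempts ? "retry" : "fail_run";
}
```

With `maxAttempts: 3`, a segfault on attempts 1 and 2 is retried and a segfault on attempt 3 fails the run.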
## Summary

Adds `.claude/REVIEW.md`, a repo-specific source of truth for what AI / agent code reviewers should treat as critical in this codebase (rolling-deploy safety, hot-table indexes, recovery-path queries, testcontainers usage, etc.). Pairs with a Claude-based PR audit that flags drift between REVIEW.md and the code as it evolves.

## How the audit works

Mirrors the existing `.github/workflows/claude-md-audit.yml` pattern. On non-draft, non-fork PRs that touch code, `anthropics/claude-code-action` reads REVIEW.md, samples the PR diff, and posts a sticky comment with up to 3 of:

- `[stale]`: rule cites a path / function / table that's been removed or renamed
- `[contradiction]`: code in the PR violates a current rule
- `[missing]`: PR introduces a new pattern future reviewers should know about
- `[obsolete]`: rule asserts a constraint the repo has moved past

If nothing's off, posts `✅ REVIEW.md looks current for this PR.`

## Test plan

- [ ] Convert this PR to ready-for-review, confirm the audit runs and posts a sticky comment
- [ ] Verify the audit doesn't run on fork PRs (gated by `head.repo.full_name == github.repository`)
- [ ] Verify suggestions are actionable on at least one follow-up PR
…3499) ## Summary Consolidates the webapp's authentication and authorization into a small set of route helpers, replacing the ad-hoc `requireUser` / `requireUserId` / `authenticatedEnvironmentForAuthentication` calls scattered across routes. Same security model, but the per-request flow (authenticate → authorize → load) now lives in one place per route family. Introduces a plugin seam (`@trigger.dev/plugins`) that lets the cloud build install a richer RBAC implementation without touching webapp code. The OSS fallback keeps the pre-RBAC permissive behaviour intact, so self-hosted deployments work unchanged. Adds a comprehensive end-to-end auth test suite that didn't exist before — 193 `it()` blocks (vitest reports ~199 after `it.each` expansion) covering API key, PAT and JWT auth across the public API surface, plus dashboard session auth for admin pages. ## Changes ### Plugin contract — `@trigger.dev/plugins` `RoleBaseAccessController` interface authoritative for both OSS (fallback) and cloud (enterprise plugin): - `authenticateBearer(request, { allowJWT? })` — API-key / public-JWT auth, returns env + ability - `authenticateSession(request, { userId, organizationId?, projectId? })` — dashboard auth, caller resolves `userId` from the session cookie and passes it in (no `helpers.getSessionUserId` callback — decouples the plugin host from session-cookie code) - `authenticatePat(request, { organizationId?, projectId? 
})` — PAT auth, returns identity + `lastAccessedAt` so the host can throttle the per-request update - `authenticateAuthorize*` variants for the auth-and-check-in-one-call cases - `isUsingPlugin(): Promise<boolean>` — capability flag for UI / branching where plugin-present-ness matters; replaces the sentinel-string coupling that had `personalAccessToken.server` matching `"RBAC plugin not installed"` literally ### Dashboard auth (started, partial rollout) Admin and settings pages migrated to a unified `dashboardLoader` / `dashboardAction` helper that authenticates the session, runs an authorization check, and exposes the result to the route. Other dashboard routes still on the old pattern; remaining migration tracked in TRI-8730. Migrated routes: - `admin.*` (14 admin / back-office / feature-flags / LLM-models / notifications / orgs / concurrency pages) - `_app.orgs.$organizationSlug.settings.team` - `_app.orgs.$organizationSlug.settings.roles` ### API / realtime / engine auth (complete for the migrated families) 71 routes migrated to a unified `apiBuilder` that centralizes Bearer / PAT / Public-JWT authentication and applies the per-route authorization check before the handler runs. Includes: - `api.v1.*` and `api.v2.*` and `api.v3.*` — tasks, runs, batches, queues, prompts, deployments, query, sessions, waitpoints, packets, workers, idempotency keys - `realtime.v1.*` — runs, batches, sessions, streams - `engine.v1.*` — dev / worker-action protocols 29 routes still on the legacy `authenticateApiRequest*` helpers — tracked as a post-deploy follow-up in TRI-9228. Multi-resource auth direction is now explicit at the call site via `anyResource(...)` (OR) and `everyResource(...)` (AND). Bare arrays no longer typecheck — fixes a class of bug where a JWT scoped to one resource could implicitly access others under OR semantics. 
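The difference between the OR and AND directions can be sketched as below. Only the names `anyResource` and `everyResource` come from the description; the `Resource` and `ResourceCheck` shapes and the `isAuthorized` function are assumptions for illustration, not the real `apiBuilder` types.

```typescript
// Hypothetical shapes; the real route-builder types are not shown in the PR text.
type Resource = { type: string; id: string };
type ResourceCheck = { mode: "any" | "every"; resources: Resource[] };

// OR semantics: access to at least one listed resource is enough.
function anyResource(...resources: Resource[]): ResourceCheck {
  return { mode: "any", resources };
}

// AND semantics: access to every listed resource is required.
function everyResource(...resources: Resource[]): ResourceCheck {
  return { mode: "every", resources };
}

// `can` stands in for the ability check produced by authentication.
function isAuthorized(check: ResourceCheck, can: (r: Resource) => boolean): boolean {
  if (check.resources.length === 0) return false; // never authorize an empty check
  return check.mode === "any"
    ? check.resources.some(can)
    : check.resources.every(can);
}
```

Because `isAuthorized` only accepts a `ResourceCheck`, passing a bare `Resource[]` fails to typecheck in this sketch, so the caller has to state which direction is intended.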
PAT auth path consolidated: was three DB queries per request (legacy `authenticateApiRequestWithPersonalAccessToken` findFirst + `rbac.authenticatePat` join + `lastAccessedAt` update). Now one query in the steady state — plugin returns `lastAccessedAt`, host smart-skips the update via JS-side throttle when fresh. Side effect: action aliases preserved historic JWT scope semantics where the new model is stricter (e.g. a `write:tasks` JWT now also satisfies `trigger` / `batchTrigger` / `update` actions on the same resource — matched at the auth boundary, not in the route handler). ### Backwards-compat fixes The strict-match model regressed several real-world JWT shapes. Each preserved via explicit `anyResource(...)` entries in the route's authz block: - **Batch retrieve routes** (`api.v1.batches.$batchId`, `api.v2.*`, `realtime.v1.batches.*`) accept `read:runs` JWTs again (pre-RBAC literal-match superScope behaviour) - **Runs list routes** (`api.v1.runs`, `realtime.v1.runs`) accept type-level `read:tasks` / `read:tags` on unfiltered queries (matched the legacy `Object.keys` iteration semantic) - **PAT/OAT auth shape** normalized through `toAuthenticated` so all auth methods return the same slim `AuthenticatedEnvironment` (was: API-key returned the slim shape but PAT/OAT returned raw Prisma `Decimal` / no `orgMember`) - **Scope `:` preservation** in resource ids — `read:tags:env:staging` now correctly identifies the tag id as `env:staging`, not `env` ### Slim `AuthenticatedEnvironment` Extracted to `@trigger.dev/core/v3/auth/environment` — a structural shape independent of `@trigger.dev/database`. The plugin contract returns this; webapp consumers import from there; the cloud plugin (Drizzle) returns the same shape without Prisma's `Decimal` class leaking into the public surface. Lets internal-packages (run-engine, etc.) refer to `AuthenticatedEnvironment` without pulling Prisma in. 
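The scope `:` preservation fix listed under the backwards-compat fixes can be illustrated with a small parser. This is a sketch under the assumption that a scope has the form `action:resourceType[:resourceId]`; `parseScope` and `ParsedScope` are hypothetical names, not the webapp's actual implementation.

```typescript
// Hypothetical parsed form of a JWT scope string.
type ParsedScope = {
  action: string;
  resourceType: string;
  resourceId?: string;
};

function parseScope(scope: string): ParsedScope {
  // Split off the first two segments only; everything after them is the id,
  // so ids that themselves contain ":" are kept whole.
  const [action, resourceType, ...rest] = scope.split(":");
  if (!action || !resourceType) {
    throw new Error(`Malformed scope: ${scope}`);
  }
  return rest.length > 0
    ? { action, resourceType, resourceId: rest.join(":") }
    : { action, resourceType };
}
```

With this approach `read:tags:env:staging` yields the tag id `env:staging`, whereas taking only the third segment would have produced `env`.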
### Auth test suite (new — `*.e2e.full.test.ts`) 193 e2e tests run against a real spawned webapp + Postgres (no mocks). Coverage matrix: - **API key auth** — read / write / trigger / batchTrigger / deploy actions across runs, batches, deployments, prompts, queues, query, sessions, input-streams, waitpoints, tasks, idempotency keys; multi-key resources (a run carries batch / tag / task identifiers — auth must accept any matching scope) - **Personal Access Token auth** — comprehensive matrix: scope match, scope mismatch, missing scope, expired token, malformed token - **Public JWT auth** — sub-vs-URL environment resolution, expired JWTs, signature verification, scope checking, otu (one-time-use) token semantics, branch-environment signing-key fallback - **Dashboard session auth** — admin-only pages reject non-admins; per-action gating - **Cross-cutting edge cases** — revoked API key grace window, JWT cross-environment isolation, MissingResource branch behaviour ### Hygiene cleanups - Deleted dead `app/services/authorization.server.ts` (legacy `checkAuthorization` + types — no live consumers post-migration) and its orphaned test - Dropped the never-populated `scopes` field from `ApiAuthenticationResultSuccess` - `scheduleEmail` moved out of `email.server.ts` into its own module — breaks a `commonWorker → marqs/V1` import chain that was poisoning the auth test graph - OSS Roles page shows a deployment-aware empty state ("Roles aren't available in this self-hosted deployment" vs the plan-upsell copy) via `rbac.isUsingPlugin()` - Team action handler: explicit per-intent ability gates (`manage:billing` for purchase-seats, `manage:members` for set-role + remove-member with self-leave carve-out) ### Cross-repo coordination All public-package contract changes paired in `triggerdotdev/cloud#763` (rbac-packages branch) — the enterprise plugin implements the same `RoleBaseAccessController` interface against Drizzle. 
## Test plan - [x] `pnpm run typecheck --filter webapp` clean - [x] `pnpm --filter webapp exec vitest run --config vitest.e2e.full.config.ts` — 193/193 pass (requires Docker for testcontainers) - [x] Spot-check an authed API endpoint with a valid + invalid API key against a local stack - [x] Spot-check the migrated admin pages render and gate non-admins --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… queues (#3558)

## Summary

Queues that use concurrency keys can no longer bypass the per-queue length cap, and the "Queued | Running" columns in the dashboard now show the true total across all CK variants instead of 0.

The cap and the dashboard both relied on `ZCARD` of the base queue key, but CK-keyed runs live under `<base>:ck:<variant>` keys. Any queue that used concurrency keys read 0, letting a single CK variant grow unbounded past the user's configured cap.

## Fix

Two per-base-queue counters are maintained inside the CK Lua scripts: `<base>:lengthCounter` and `<base>:runningCounter`. Non-CK enqueue/dequeue paths are untouched.

Counters are lazy-initialized the first time a CK enqueue (or nack) lands on a queue: the Lua script sums `ZCARD` across the variants tracked by `ckIndex`, sets the counter, then `INCR`s. Pre-existing CK backlog on already-populated queues is captured automatically, so no batch migration is required.

`INCR`/`DECR` is gated on `ZADD`/`SADD` returning 1 (a new entry vs an idempotent no-op), so duplicate enqueues or re-dequeues don't inflate the counter.

The counter is `SET` with a 24-hour TTL on init. `INCR`/`DECR` do not extend the TTL, so the counter expires daily and the next CK operation re-seeds it from `ckIndex`. This bounds any drift that accumulates during the rolling-deploy overlap window, where old (un-Tracked) and new (Tracked) webapp instances briefly coexist, to ≤24 hours, with no admin sweep or background reconciler needed.

Read paths pipeline `ZCARD`/`SCARD` on the base key + `GET` on the counter and sum. A missing counter is treated as 0, so pure non-CK queues see the same answer as before.

The counter-aware scripts ship alongside the originals with a `Tracked` suffix for rolling-deploy safety; a follow-up PR will drop the originals once this has rolled out.
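The counter bookkeeping described in the Fix section can be modelled in memory to make the ordering explicit. This is only an illustrative sketch: the real logic lives in Redis Lua scripts, and `CkQueueModel` and its method names are hypothetical. It covers the length counter only and ignores dequeue, nack, and the running counter.

```typescript
class CkQueueModel {
  // Stands in for the `<base>:ck:<variant>` sorted sets tracked by `ckIndex`.
  private readonly variants = new Map<string, Set<string>>();
  // Stands in for `<base>:lengthCounter`; undefined means missing or expired.
  private lengthCounter: number | undefined = undefined;

  // Writes without touching the counter, like an old (un-Tracked) instance.
  enqueueUntracked(variant: string, runId: string): void {
    this.variantSet(variant).add(runId);
  }

  enqueueTracked(variant: string, runId: string): void {
    let counter = this.lengthCounter;
    if (counter === undefined) {
      // Lazy init: sum the size of every known variant (the ZCARD sum).
      let total = 0;
      this.variants.forEach((set) => {
        total += set.size;
      });
      counter = total;
    }
    const set = this.variantSet(variant);
    const isNew = !set.has(runId); // models ZADD returning 1
    set.add(runId);
    // Increment only for genuinely new entries, never for idempotent re-adds.
    this.lengthCounter = isNew ? counter + 1 : counter;
  }

  // Models the 24-hour TTL elapsing.
  expireCounter(): void {
    this.lengthCounter = undefined;
  }

  // Read path: base queue size plus the counter, with a missing counter read as 0.
  queuedTotal(baseQueueSize: number): number {
    return baseQueueSize + (this.lengthCounter ?? 0);
  }

  private variantSet(variant: string): Set<string> {
    let set = this.variants.get(variant);
    if (!set) {
      set = new Set<string>();
      this.variants.set(variant, set);
    }
    return set;
  }
}
```

Seeding 31 untracked entries and then performing one tracked enqueue gives a total of 32 in this model: the counter materializes at 31 from the existing backlog and is then incremented once.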
## Test plan

- [ ] `pnpm run test --filter @internal/run-engine`: 116 tests pass, including a new `ckCounters.test.ts` covering lazy init from pre-existing backlog, churn, floor-at-zero, the non-CK regression case, mixed CK + non-CK on the same base queue, idempotent re-enqueue (ZADD-already-exists), 24h TTL on the counter, and nack re-seeding after counter expiry.
- [ ] Verified end-to-end against a live local environment:
  - Triggered 24 CK enqueues across 4 variants → `lengthCounter=16`, `runningCounter=8`, dashboard showed Queued=16 / Running=8 for the CK queue.
  - Set the env queue cap to 16, triggered 12 more enqueues → 8 succeeded, 4 rejected with `QueueSizeLimitExceededError`.
  - Deleted the counter on a queue with 31 messages already sitting in CK variants, triggered one more enqueue → counter materialized to 31 from the `ckIndex` sum, then INCR'd.
## Summary

Local ClickHouse was burning ~325% CPU endlessly merging its own telemetry tables (`metric_log`, `asynchronous_metric_log`, `part_log`, `trace_log`) after the container had been running long enough to accumulate hundreds of GB of system-log data. OrbStack Helper reflected this on the host (~400% CPU).

These tables are not used by anything in the dev stack. They only exist for ClickHouse to log itself, so disabling them eliminates the merge churn entirely.

## Changes

- Adds `docker/config/clickhouse-disable-system-logs.xml`, mounted into `/etc/clickhouse-server/config.d/`, that removes the noisy system log tables via `<table remove="1"/>`.
- Mounts the override file in `docker/docker-compose.yml`.

After applying, idle CPU dropped from 325% to ~12% on my machine.

## Test plan

- [ ] `pnpm run docker` brings up the stack cleanly
- [ ] `docker stats clickhouse` shows low idle CPU
- [ ] App functionality unaffected (system log tables are not queried by the webapp)
…mpling (#3567) ## Summary Follow-up to #3561. The drift-audit workflow timed out on PR #3542 (92 files, +5962 lines) by hitting `--max-turns 15` before reaching a verdict, leaving a red ❌ on that PR with no sticky comment. ## Changes - `--max-turns` bumped from 15 to 30. - Prompt now opens with an explicit "Strategy" section: read REVIEW.md once, scan the file-list only, open at most 5 files (3-5 on PRs >50 files), and bias toward finishing over exploring. - Final rule: *"when in doubt between one more file read and finish now — finish now."* The audit is allowed to miss things. It is not allowed to time out and leave a red X. ## Test plan - [ ] Verify this PR's audit posts `✅ REVIEW.md looks current for this PR.` (small diff) - [ ] After merge, retry the audit on #3542 or a similarly large PR and confirm it completes
…#3564) ## Summary - Users on production are hitting `QuotaExceededError: Failed to execute 'setItem' on 'Storage'` when navigating runs, because their localStorage is full of orphaned `panel-group-react-aria<n>-:<rid>:` entries. - Each entry is a session-unique key written by the resizable panel library; they accumulated to thousands per user over the last two months and now block legitimate `setItem` calls (the run-view inspector can no longer persist its layout, and the page crashes mid-render). - This PR evicts the legacy entries once on client boot. The leak itself is already plugged by the v1.1.3 upgrade in #XXXX — this is the cleanup that recovers the wasted quota on existing users' machines. ## Root cause (already fixed, for context) In v0.4.1 of the underlying library, `PanelGroupImpl` defaulted `autosaveStrategy` to `"localStorage"` unconditionally — so *every* `PanelGroup` wrote to localStorage on every autosave trigger, including the four in `QueryEditor`, the one in `ReplayRunDialog`, the storybook routes, etc. Without an `autosaveId`, the key fell back to `panel-group-${useId()}`, and React Aria's `useId()` produces a new session-unique prefix each visit. Result: entries accumulated without bound across sessions. The condition was introduced when [#3282](#3282) removed the wrapper's explicit `autosaveStrategy="cookie"` override (to fix HTTP 431 cookie-size errors). That worked, but the library default that took over silently caused this leak. The v1.1.3 upgrade in the resizable-panel PR changed the default to `autosaveStrategy = autosaveId ? "localStorage" : undefined`, so no new entries are being written. Existing residue still needs to be removed from users' browsers. ## Changes - New file [`apps/webapp/app/clientBeforeFirstRender.ts`](apps/webapp/app/clientBeforeFirstRender.ts) — exports a `clientBeforeFirstRender()` function that runs synchronously, before React hydrates. 
Encapsulates a small cleanup helper that scans `localStorage` and removes: - Every key starting with `panel-group-react-aria` (the legacy auto-generated keys). - The orphan `panel-run-parent-v2` key from before the autosaveId v2→v3 bump. - [`apps/webapp/app/entry.client.tsx`](apps/webapp/app/entry.client.tsx) — imports and invokes `clientBeforeFirstRender()` once, before `hydrateRoot()`. This guarantees the cleanup completes before any `ResizablePanelGroup` mounts and tries to write. The cleanup is wrapped in `try/catch` so private-browsing / disabled-storage scenarios fail silently. Idempotent: subsequent loads find no matching keys and exit immediately. ## Test plan - [x] Locally seed ~50 fake `panel-group-react-aria…` entries plus a `panel-run-parent-v2` entry via DevTools console, hard reload → legacy entries gone, real entries (`panel-run-parent-v3`, `panel-run-tree`) preserved. - [x] Idempotency: reload a second time, no errors, no state changes. - [x] Add a control entry (`panel-run-parent-v3-but-different-suffix`) — confirmed not over-matched. - [x] Simulate broken `Storage.setItem` throwing — page still renders, cleanup swallows the error. - [x] Typecheck clean. ## Notes - Customer report: `QuotaExceededError: Failed to execute 'setItem' on 'Storage': Setting the value of 'panel-run-parent-v3' exceeded the quota.` - The cleanup runs once per page load. Once a user has loaded the app after this deploys, their localStorage is clean and the function becomes a no-op forever.
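The cleanup described above can be sketched as follows. This is not the contents of `clientBeforeFirstRender.ts`: `removeLegacyPanelKeys` and the `StorageLike` interface are hypothetical, introduced so the example runs outside a browser. In the browser, `window.localStorage` satisfies the same interface.

```typescript
// Minimal subset of the Web Storage interface (an assumption for testability).
interface StorageLike {
  readonly length: number;
  key(index: number): string | null;
  removeItem(key: string): void;
}

const LEGACY_PREFIX = "panel-group-react-aria";
const ORPHAN_KEYS = ["panel-run-parent-v2"];

// Removes legacy panel keys and returns how many were removed.
function removeLegacyPanelKeys(storage: StorageLike): number {
  try {
    // Collect first, then remove: removing while iterating shifts the indexes.
    const doomed: string[] = [];
    for (let i = 0; i < storage.length; i++) {
      const key = storage.key(i);
      if (key === null) continue;
      // Prefix match for auto-generated keys, exact match for the orphan key.
      if (key.startsWith(LEGACY_PREFIX) || ORPHAN_KEYS.includes(key)) {
        doomed.push(key);
      }
    }
    for (const key of doomed) storage.removeItem(key);
    return doomed.length;
  } catch {
    // Private browsing or disabled storage: fail silently.
    return 0;
  }
}
```

The orphan key is matched exactly instead of by prefix, so an entry such as `panel-run-parent-v3` is left alone, and a second call finds nothing to remove.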
## Summary - Recommend deploying NodeLocal DNS and lowering `ndots` to `1` in the Kubernetes self-hosting guide. - Recommend storing task events in ClickHouse (`EVENT_REPOSITORY_DEFAULT_STORE=clickhouse_v2`) in both the Docker and Kubernetes guides, plus a new row in the webapp env var reference.
`pr_checks` runs the full matrix on every PR. #3609 touched only `apps/webapp/app/routes/admin.tsx` and still ran the 4-job CLI e2e matrix and 5-job sdk-compat suite.

Adds a `changes` job using `dorny/paths-filter` and gates each tier:

- webapp + e2e-webapp: `apps/webapp/**`, `packages/**`, `internal-packages/**`
- packages: `packages/**`
- internal: `internal-packages/**` + `packages/**` (cross-deps)
- e2e (cli-v3): `packages/{cli-v3,build,core,schema-to-json}/**`
- sdk-compat: `packages/{trigger-sdk,core}/**`

`.configs/**`, `package.json`, `pnpm-lock.yaml`, `pnpm-workspace.yaml`, `turbo.json` are also included in every filter since they affect the whole workspace.

Inlines the `units` reusable-workflow children so each can be gated independently (status check names also flatten from `units / webapp / ...` to `webapp / ...`). `unit-tests.yml` is unaffected and is still used by `publish.yml`.

Adds an `all-checks` gate that always runs and short-circuits to success when every dependent is success-or-skipped. With this in place a single required status check (`All PR Checks`) is enough; before this, `paths-ignore` would have left required checks Pending on docs/changeset PRs ([gh docs](https://docs.github.com/en/actions/managing-workflow-runs/skipping-workflow-runs)).
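The rule applied by the `all-checks` gate can be written down as a small predicate. The sketch below only models the decision in TypeScript; the actual gate is a workflow job, and `allChecksPass` is a hypothetical name. The four result values are the ones GitHub Actions reports for `needs.<job>.result`.

```typescript
// Possible values of `needs.<job>.result` in GitHub Actions.
type JobResult = "success" | "failure" | "cancelled" | "skipped";

// Passes when every dependent job either succeeded or was skipped by its filter.
function allChecksPass(results: Record<string, JobResult>): boolean {
  return Object.values(results).every(
    (result) => result === "success" || result === "skipped"
  );
}
```

A docs-only PR where every tier is skipped passes the gate, while a single failed or cancelled tier fails it.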
…nizations (#3609) Switching between the Users and Organizations tabs in the admin dashboard now keeps the current `?search=` value, so you can flip between the two without re-typing your filter. Other admin tabs don't take `search` and so don't carry it.
Adds Sessions, a durable, run-aware stream primitive that scopes session.in / session.out records to a session (not a single run). Records survive run boundaries; reconnect-from-last-event-id is built in.

Server foundation:
- New /realtime/v1/sessions/:session/:io/append + /records routes
- sessionRunManager + sessionsRepository + clickhouseSessionsRepository
- mintRunToken for short-lived per-session tokens
- s2Append retry-with-backoff + undici cause diagnostics
- /api/v[12]/packets/* exempt from customer rate limits
- BackgroundWorker schema gains taskKind enum (TASK, AGENT, SCHEDULED)
- TaskRun.taskKind column + clickhouse 029_add_task_kind_to_task_runs_v2

Core types:
- new sessionStreams, inputStreams, realtimeStreams packages in @trigger.dev/core
- session-streams-api / realtime-streams-api surface

Sessions dashboard UI (the primitive's own viewer):
- /sessions index + detail routes
- SessionsTable, SessionFilters, SessionStatus, CloseSessionDialog
- AGENT/SCHEDULED filter in RunFilters + TaskTriggerSource

Includes the sessions-primitive changeset.
`tasks.trigger`, `tasks.batchTrigger`, `batch.create`, `wait.createToken`, `wait.forDuration`, and the input/session stream waitpoint endpoints all accept a caller-supplied `idempotencyKey` and store it verbatim against a composite-unique index on `TaskRun`, `BatchTaskRun`, or `Waitpoint`. The schemas had no length cap, so a sufficiently long high-entropy key produced an index row larger than the underlying storage layer can hold. The insert failed at the database, and the caller saw a generic 500 from `RunEngineTriggerTaskService.call()` / `CreateBatchService` / waitpoint creation, depending on the endpoint. Keys produced by `idempotencyKeys.create()` are 64-character SHA-256 hashes and never trip this — it only manifests for direct REST callers (or SDK callers passing a raw string they generated themselves). Low-entropy keys also sail through, because the storage layer compresses repeated bytes before they reach the index, which is why the failure mode is intermittent and tied to caller-side key shape. ## Fix Add `.max(2048, "<field> must be 2048 characters or less")` to the seven schemas that feed an indexed `idempotencyKey` column: - `TriggerTaskRequestBody.options.idempotencyKey` - `BatchTriggerTaskItem.options.idempotencyKey` - `CreateBatchRequestBody.idempotencyKey` - `CreateWaitpointTokenRequestBody.idempotencyKey` - `CreateInputStreamWaitpointRequestBody.idempotencyKey` - `CreateSessionStreamWaitpointRequestBody.idempotencyKey` - `WaitForDurationRequestBody.idempotencyKey` Plus the `idempotency-key` HTTP header on the trigger route (and the three batch routes that re-export `HeadersSchema`). The header schema is lifted out of `api.v1.tasks.$taskId.trigger.ts` into `apps/webapp/app/v3/triggerHeaders.server.ts` so it can be exercised in tests without dragging the route's import-time side effects. The 2048 character ceiling is chosen to sit safely under the per-row index limit while staying generous against existing callers — keys that fit before still fit. 
Oversized keys now return a structured Zod 400 instead of a generic 500. Limit is documented under `Idempotency key` in `docs/limits.mdx` and as a `<Note>` on `docs/idempotency.mdx`. ## Test plan - [x] 15 schema unit tests added (`packages/core/src/v3/schemas/idempotencyKey.test.ts`, `apps/webapp/test/routes/triggerHeaders.test.ts`) — rejection-with-message + boundary acceptance for each capped schema. The webapp test exercises the extracted `TriggerHeadersSchema` directly with no mocks. - [x] `pnpm run build --filter @trigger.dev/core` - [x] `pnpm run typecheck --filter webapp` - [x] End-to-end verified locally: baseline (small key) → 200; 3000-char high-entropy header → 400 with the expected Zod error; same key at the 2048 boundary → 200; same key with the cap reverted → the database rejected the insert and the route returned 500 to the caller. Cap restored. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
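The effect of the length cap can be sketched without the schema library. The function below is a plain-TypeScript stand-in for the Zod `.max(2048, ...)` check, not the actual schema code: `validateIdempotencyKey` and `ValidationResult` are hypothetical, while the 2048 limit, the 400 status, and the message wording come from the description.

```typescript
const MAX_IDEMPOTENCY_KEY_LENGTH = 2048;

// Hypothetical result shape standing in for a schema parse result.
type ValidationResult =
  | { ok: true }
  | { ok: false; status: 400; message: string };

function validateIdempotencyKey(
  field: string,
  key: string | undefined
): ValidationResult {
  // The key is optional; only present keys are length-checked.
  if (key !== undefined && key.length > MAX_IDEMPOTENCY_KEY_LENGTH) {
    return {
      ok: false,
      status: 400,
      message: `${field} must be ${MAX_IDEMPOTENCY_KEY_LENGTH} characters or less`,
    };
  }
  return { ok: true };
}
```

A key of exactly 2048 characters is accepted and a 3000-character key is rejected before it reaches the database, which matches the boundary cases listed in the test plan.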
…3542)
## Summary
A `/sessions` dashboard for inspecting durable Sessions, an `AGENT` / `SCHEDULED` task-kind filter for the runs list, and the server-side hardening (rate-limit exemption for packets, retry-with-backoff on stream appends, typed too-large-chunk error) that the `chat.agent` runtime in #3543 needs. Builds on the Sessions primitive shipped in #3417.
## Design
The Sessions list + detail routes mirror the run inspector pattern. `TaskTriggerSource` gains `AGENT` and `SCHEDULED` values, persisted on `BackgroundWorker.taskKind` and `TaskRun.taskKind` (plus a matching Clickhouse column), so the runs list can filter by kind.
New `@trigger.dev/core` modules (`sessionStreams`, `inputStreams`, a `sessionStreamInstance` for realtime streams, and the `realtime-streams-api` / `session-streams-api` surfaces) expose the typed shapes that chat.agent will use to drive `session.out`. `ChatChunkTooLargeError` lets the runtime drop oversized chunks with a typed surface instead of failing the run. `s2Append` retries transient failures with exponential backoff. `/api/v[12]/packets/*` is exempt from customer rate limits so chat snapshot reads and writes don't get throttled under load.
## Stack
Part of a 4-PR stack. Merge bottom-up.
1. **This PR** (#3542) → `main`
2. #3543 → #3542: `chat.agent` runtime + browser transport
3. #3545 → #3543: agent-view dashboard
4. #3546 → #3545: ai-chat reference + MCP tooling

Replaces #3173 (closed).

---
This is **part 5 of 5 in a stack** made with GitButler:
- <kbd> 5 </kbd> #3612
- <kbd> 4 </kbd> #3546
- <kbd> 3 </kbd> #3545
- <kbd> 2 </kbd> #3543
- <kbd> 1 </kbd> #3542 👈
The `code` paths filter currently matches `**` minus a tiny exclusion list, so a PR that only touches `.github/workflows/*.yml` still flips `code == true` and runs typecheck (~2 min on the runner). Exclude `.github/**` from `code`, then re-include just `pr_checks.yml` and `typecheck.yml` so a change to either of those still triggers the full code check matrix.
Effect:
- workflow-only PRs (this one, future dependabot/codeql/etc.) skip typecheck; `all-checks` treats the skipped job as non-failure so the required status passes.
- modifying `pr_checks.yml` or `typecheck.yml` themselves still triggers typecheck.
- the existing per-suite filters (`webapp`, `packages`, `internal`, `cli`, `sdk`) already re-include the specific workflows that gate them, so they're unaffected.
Adds a Mon 08:00 UTC workflow that posts a summary of open Dependabot alerts and PRs to Slack. Uses env-scoped secrets so the alerts PAT and Slack token are only available to this workflow.
Adds the chat.agent({...}) task definition (server runtime) and the
browser-side TriggerChatTransport + AgentChat that drives it from a
React or Next.js app. The runtime sits on top of the Sessions primitive
and handles the durable conversational task lifecycle.
Server runtime:
- chat.agent({...}) — session-aware task definition
- Lifecycle hooks: onChatStart, onTurnStart, onTurnComplete, onAction,
onValidateMessages, hydrateMessages
- chat.history read primitives for HITL flows
- chat.local, chat.headStart, chat.handover, oomMachine
- Delta-only wire + S3 snapshot reconstruction at run boot
- Actions are no longer turns
Browser transport:
- TriggerChatTransport (ai-sdk Transport): delta-only wire sends,
SSE reconnection with lastEventId resume, stop/abort cleanup,
dynamic accessToken refresh
- AgentChat: direct programmatic API
- useTriggerChatTransport (React hook)
- chat-tab-coordinator: cross-tab leader election
Includes the chat-agent, chat-agent-delta-wire-snapshots,
chat-history-read-primitives, chat-head-start, chat-actions-no-turn,
chat-session-attributes, agent-skills, and mock-chat-agent-test-harness
changesets.
## Summary
Adds `chat.agent({...})`, a durable conversational task runtime, plus
the browser-side `TriggerChatTransport` + `AgentChat` that drive it from
a React or Next.js app. Conversations survive page refreshes, network
blips, idle suspend, and process restarts, with built-in tools, HITL
approvals, multi-turn state, and stop-mid-stream cancellation. Builds on
#3542.
## Design
Each `/in/append` request carries at most one new message. The agent
reconstructs prior history at run boot from an object-store snapshot
plus a `session.out` replay tail, so conversation context lives
server-side instead of bloating the wire. Awaited snapshot writes after
every `onTurnComplete` keep the chain durable across idle suspend.
Registering `hydrateMessages` short-circuits both paths for customers
who own their own conversation store.
Lifecycle hooks — `onChatStart`, `onTurnStart`, `onTurnComplete`,
`onAction`, `onValidateMessages`, `hydrateMessages` — cover validation,
persistence, and post-turn work. `chat.history` exposes read primitives
(`getPendingToolCalls`, `getResolvedToolCalls`, `extractNewToolResults`,
`findMessage`, `all`) for HITL flows. `chat.local` gives per-run typed
state with Proxy access and dirty tracking. `chat.headStart` bridges
first-turn TTFC via a customer HTTP handler. `oomMachine` opts a chat
into one-shot OOM-retry on a larger machine.
`TriggerChatTransport` is a `Transport` implementation for Vercel's
ai-sdk `useChat`: delta-only wire sends, SSE reconnection with
`lastEventId` resume, stop/abort cleanup, dynamic `accessToken` refresh,
`X-Peek-Settled` fast-close. `AgentChat` is the direct programmatic
equivalent. A cross-tab coordinator does leader election so multiple
open tabs share a single SSE.
```ts
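import { chat } from "@trigger.dev/sdk/ai";
import { openai } from "@ai-sdk/openai";
import { streamText } from "ai";

export const myChat = chat.agent({
  id: "my-chat",
  // Forward `signal` so the model call can be aborted when the chat is stopped.
  run: async ({ messages, signal }) =>
    streamText({ model: openai("gpt-4o"), messages, abortSignal: signal }),
});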
```
#3610)
Concurrent `POST /api/v1/deployments` requests for the same environment race on the `WorkerDeployment(environmentId, version)` unique constraint. Both requests read the same latest deployment via `findFirst`, compute the same next version via `calculateNextBuildVersion`, and both attempt `prisma.workerDeployment.create()`: one wins, the other crashes with Prisma `P2002`. The bug is a classic TOCTOU between the version read and the version write; it's been latent since the version-assignment logic was first added but only fires when two deploys land within milliseconds of each other (CI matrices, retried CLI calls, webhook-triggered redeploys).
## Approach
Extracts the version assignment + create into a small helper `createDeploymentWithNextVersion` (`apps/webapp/app/v3/services/initializeDeployment/createDeploymentWithNextVersion.server.ts`). The helper retries on `P2002 (environmentId, version)` up to 5 times with randomised 5–50ms jitter so N concurrent racers don't loop in lockstep. Each attempt re-reads the latest version, recomputes via `calculateNextBuildVersion`, and re-runs the caller's `buildData` callback so version-dependent fields (image ref tag, friendlyId) are always consistent with the version actually persisted. A `logger.warn` fires per collision so the retry rate is observable in production logs.
When retries are exhausted, the helper throws a dedicated `DeploymentVersionCollisionError` carrying `environmentId`, `attempts`, and `lastAttemptedVersion`, with the original `PrismaClientKnownRequestError` attached as `cause`. Sentry walks the `cause` chain natively, so contention exhaustion shows up as a distinguishable wrapper exception linked to the underlying `P2002` rather than a generic unique-constraint violation that looks identical to every other duplicate-key bug.
The behavioural change is limited to "catch P2002 and retry instead of crashing." The image ref computation stays inside the builder callback (same call site as before the refactor), so ECR / non-ECR behaviour, S2 stream creation order, and all downstream side effects are unchanged.
## Non-goals
- No new database migrations, no schema changes, no isolation-level / locking changes. A serialisable transaction or advisory lock would also fix this; retry-on-conflict is the smaller change that keeps the existing version-allocation logic intact.
- Does not touch the analogous `calculateNextBuildVersion` call in `createBackgroundWorker.server.ts`, which likely has the same race shape against `BackgroundWorker`'s unique constraint; flagged as a follow-up.
## Test plan
- [x] `pnpm run typecheck --filter webapp` passes (no new errors in the modified files).
- [x] Three real-Postgres tests in `apps/webapp/test/createDeploymentWithNextVersion.test.ts` via `containerTest`:
  - 5 concurrent calls all produce distinct, persistable versions (`Set(versions).size === concurrency`). The naive read-then-create version of the helper fails this test with the exact same `P2002` seen in production; the retry version passes.
  - Non-`P2002` errors raised from the `buildData` callback propagate immediately without retry, builder invoked exactly once.
  - With `maxRetries: 0`, concurrent racers surface the wrapped `DeploymentVersionCollisionError` (not a raw `P2002`); `environmentId`, `attempts`, `lastAttemptedVersion` are populated and `error.cause.code === "P2002"`.
- [x] Existing `apps/webapp/test/getDeploymentImageRef.test.ts` still green (the file was untouched in the final diff).
## Follow-ups (not in this PR)
- `createBackgroundWorker.server.ts` likely has the same TOCTOU shape against its background-worker version unique constraint; it should use the same helper.
- Sentry visibility check: confirm `error.cause` chain renders as a linked exception in the Sentry UI when the wrapped error fires (requires a sandboxed triggering of the exhaustion path).
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
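The retry-on-conflict loop described in the deployment-version PR above can be sketched against an in-memory store. This is a simplified illustration under stated assumptions, not the helper from the PR: `InMemoryDeployments`, `createWithNextVersion`, and the error classes are hypothetical, the unique-constraint error only imitates the `code: "P2002"` field, and the `buildData` callback and logging are omitted.

```typescript
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Hypothetical stand-in for a unique-constraint violation (imitates `code: "P2002"`).
class UniqueConflictError extends Error {
  readonly code = "P2002";
}

// Hypothetical wrapper thrown once retries are exhausted; keeps the original error as `cause`.
class VersionCollisionError extends Error {
  constructor(
    readonly attempts: number,
    readonly lastAttemptedVersion: number,
    readonly cause: unknown
  ) {
    super(`version collision after ${attempts} attempt(s)`);
  }
}

// In-memory table with a unique constraint on `version`; awaits simulate I/O latency.
class InMemoryDeployments {
  private versions = new Set<number>();
  async latest(): Promise<number> {
    await sleep(0);
    return Math.max(0, ...Array.from(this.versions));
  }
  async create(version: number): Promise<number> {
    await sleep(0);
    if (this.versions.has(version)) throw new UniqueConflictError("duplicate version");
    this.versions.add(version);
    return version;
  }
}

const isUniqueConflict = (error: unknown): boolean =>
  typeof error === "object" && error !== null && (error as { code?: unknown }).code === "P2002";

async function createWithNextVersion(store: InMemoryDeployments, maxRetries = 5): Promise<number> {
  for (let attempt = 1; ; attempt++) {
    // Re-read the latest version on every attempt so the next candidate is fresh.
    const next = (await store.latest()) + 1;
    try {
      return await store.create(next);
    } catch (error) {
      if (!isUniqueConflict(error)) throw error; // unrelated errors propagate immediately
      if (attempt > maxRetries) throw new VersionCollisionError(attempt, next, error);
      await sleep(5 + Math.random() * 45); // 5-50ms jitter so racers do not retry in lockstep
    }
  }
}
```

With five concurrent callers, each caller can lose at most once to each of the other four, so the default of five retries is enough for all of them to land on distinct versions in this simulation.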
Dashboard surfaces for inspecting and debugging chat.agent runs. Depends on the Sessions primitive (L1) and chat.agent runtime (L2+L3).
Run inspector, chat-aware:
- AgentView + AgentMessageView (run inspector tab for chat.agent runs)
- AIChatMessages + AISpanDetails + types.ts (per-span chat message rendering, tool-call/tool-output handling)
- PromptSpanDetails (gen_ai.* span detail panel)
- StreamdownRenderer + shikiTheme (markdown renderer with shiki highlighting and v2 patch)
- useAutoScrollToBottom hook
Playground UI (interactive chat.agent debugger):
- /playground index + /playground/$agentParam routes
- /agents route + AgentListPresenter
- PlaygroundPresenter (per-org basin variants, clientData wiring)
- realtime session routes for playground + run inspector chat
- AI-generate-payload + AIPayloadTabContent for the test panel
Navigation + theming:
- SideMenu links for Agents and Playground
- BlankStatePanels copy updates
- tailwind config + tailwind.css storybook hooks
- streamdown@2 dep in apps/webapp/package.json
Includes agent-view-sessions, playground-trigger-config-fields,
run-agent-view, and streamdown-v2-upgrade .server-changes.
## Summary
A chat-aware run inspector and a `/playground` UI for testing `chat.agent` tasks interactively. Builds on #3543's runtime.
## Design
The run inspector grows a new tab that renders the conversation chain for any `chat.agent`-kind run. It subscribes to the run's session streams, threads chat parts through a per-message renderer, and uses a shared markdown + Shiki component for code highlighting (also used by the test-payload panel).
The playground is a standalone `/playground` route that lets you drive a deployed chat agent from the dashboard: pick a task, send messages, watch tool calls render, and see span detail on every turn. The matching `/agents` list view shows all deployed agents in the project.
Top of the chat.agent stack: a full Next.js reference project that exercises chat.agent end-to-end, plus the CLI MCP tools that drive agent runs from Claude Code / Cursor / etc.
references/ai-chat:
- Full Next.js app with prisma persistence, multi-chat sidebar, per-chat model picker, debug panel, tool examples, smoke tests
- Reference tools: getCurrentTime, searchHackerNews, createGithubIssue, PR review helpers, code sandbox
- chat-client-test orchestrator for concurrent-send stress
- references/hello-world chatAgent + triggerAndSubscribe examples
CLI MCP tooling for chat.agent:
- mcp/tools/agentChat.ts (start_agent_chat, send_agent_message, close_agent_chat)
- mcp/tools/agents.ts + tasks.ts (list agents, agent run details)
- dev-run-controller OOM kill + taskRunProcessPool tweaks
- dev/managed entry-point hooks for skills bundling
- buildWorker + bundleSkills (agent skills support)
Includes ai-tool-helpers + mcp-agent-chat-sessions changesets, plus the streamdown@2 patch and pnpm-lock reconciliation.
(Will be renamed to feature/ai-chat-reference-and-cli before push.)
fix(cli): preserve lastEventId after sendMessage fallback to avoid stale turn-complete replay
## Summary
A complete Next.js reference project that exercises `chat.agent` end-to-end, plus the CLI MCP tools that let Claude Code, Cursor, and similar IDE agents drive a deployed `chat.agent` task from the editor. Builds on #3545.
## Design
`references/ai-chat` is a full Next.js app: prisma-backed persistence, multi-chat sidebar, per-chat model picker, debug panel, tool examples (`getCurrentTime`, `searchHackerNews`, `createGithubIssue`, PR review helpers, code sandbox), and smoke tests. It's intended both as a copy-paste starting point and as a place to regression-test SDK changes.
The CLI gains MCP tools (`start_agent_chat`, `send_agent_message`, `close_agent_chat`, `list_agents`) so an IDE agent can converse with a deployed `chat.agent` task. The dev runtime adds one-shot OOM kill on the run controller and skills bundling in the build pipeline.
Follow-up to #3615. The `code` filter currently fires typecheck for any change outside `docs/`, `.changeset/`, `hosting/`, or `.github/`, so a docs-only PR like #3623 (touching `references/ai-chat/.env.example` + `README.md`) triggered the typecheck job. None of the `references/*` packages declare a `typecheck` script either, so even when a real code change lands there, `turbo run typecheck` skips them. Running the job is pure cost.
Tightens the filter to also exclude:
- `references/**` - playground projects, none of them contribute to `turbo run typecheck` today
- `**/*.md` - markdown anywhere
- `**/.env.example` - example env files anywhere

Two known gaps left open:
- references/ have no real CI typecheck coverage. Separate question - either add `typecheck` scripts to each (or top-level `tsc -p`), or accept playground status.
- `changes` job still runs (it's a path-filter step) but the dependent jobs all skip on irrelevant PRs.
…builds (#3626)
## Summary
`LocalsKey<T>` (the type returned by `locals.create()`) was branded with a module-level `declare const __local: unique symbol`. Each such declaration is its own nominal type, and `tshy` emits separate `.d.ts` files for the ESM and CJS outputs, so each gets its own `__local` symbol. Under certain pnpm hoisting layouts a single TypeScript compilation can resolve `LocalsKey` from both the ESM source path and the CJS dist path within the same call site, producing two structurally-incompatible variants of the same type. TS surfaces this as the misleading error:
```
Argument of type 'LocalsKey<X>' is not assignable to parameter of type 'LocalsKey<X>'.
  Property '[__local]' is missing in type 'LocalsKey<X>' but required in type 'BrandLocal<X>'.
```
The error has been hitting CI on PRs opened since the chat.agent stack landed (e.g. #3625 typecheck job), but doesn't reproduce on developer machines where the pnpm node_modules layout was built up incrementally.
## Fix
Replace the `unique symbol` brand with an optional phantom field that carries `T` at the type level:
```ts
// before
declare const __local: unique symbol;
type BrandLocal<T> = { [__local]: T };
export type LocalsKey<T> = BrandLocal<T> & {
  readonly id: string;
  readonly __type: unique symbol;
};

// after
export type LocalsKey<T> = {
  readonly id: string;
  readonly __type: symbol;
  /** Phantom carrier for the value type; never read at runtime. */
  readonly __valueType?: T;
};
```
The ESM and CJS `.d.ts` outputs now produce structurally identical types, so cross-output resolution no longer produces a mismatch. `T` is still carried at the type level via the optional phantom field. The runtime shape is unchanged; `manager.ts` was already casting via `as unknown`, which is no longer needed.
## Test plan
- [ ] `pnpm run typecheck --filter @trigger.dev/core --filter @trigger.dev/sdk`
- [ ] `pnpm run build --filter @trigger.dev/core --filter @trigger.dev/sdk` (clean rebuild); confirms the ESM and CJS dist `.d.ts` outputs no longer carry distinct `unique symbol` declarations
- [ ] `pnpm --filter @trigger.dev/core test test/mockTaskContext.test.ts --run`
- [ ] `pnpm --filter @trigger.dev/sdk test test/mockChatAgent.test.ts --run`
## Summary
A "Google auth conflict" Sentry alert fires whenever a user signs in via Google whose Google account is linked to one user row but whose Google-provided email is now on a *different* user row. The handler in `apps/webapp/app/models/user.server.ts:236` already does the right thing (it returns the existing auth-linked user and skips the update path so neither row gets mutated), but it logs the situation with `logger.error`, which routes to Sentry as an exception and pages the on-call channel.
There's no exception to chase here: the branch is the intended outcome for a known data shape (user changed their email on one account after originally signing up via Google on another). Downgrading the call to `logger.warn` keeps the diagnostic record in our logs (with all the same context fields: email, both user IDs, authIdentifier) but stops it firing the production error alert.
## Change
- `logger.error` → `logger.warn` for the conflict branch in `findOrCreateGoogleUser`. Context payload is unchanged.
## Test plan
- [x] Typecheck only; there's no behavioural change to test, the log level is the entire diff.
…3625)
## Summary
The trigger-task hotpath used to early-return without a DB query when a caller passed both a queue override and a per-trigger TTL, which is the hottest configuration on the trigger API. Adding `triggerSource` to the resolver so the runs-list "Source" filter could distinguish STANDARD / SCHEDULED / AGENT runs removed those early-returns, costing +2 DB queries per trigger on non-locked calls and +1 on locked calls.
This change caches `BackgroundWorkerTask` metadata (`ttl`, `triggerSource`, `queueId`, `queueName`) in Redis so the resolver can satisfy every caller configuration with a single `HGET` on the warm path. PG fallback on miss back-fills the cache. Follow-up to #3542.
## Design
Two key spaces:
- `task-meta:env:{envId}`: the "current worker" view, refreshed at every deploy promotion. 24h safety TTL.
- `task-meta:by-worker:{workerId}`: used for `lockToVersion` triggers. Immutable post-create. 30d sliding TTL so historical workers age out.

Cache writes use Lua scripts via `defineCommand` so `DEL` + `HSET` + `EXPIRE` land atomically, and concurrent readers never see the empty intermediate state of a naive pipeline. Read-path back-fill uses single-field upserts so concurrent back-fills don't wipe each other's siblings.
The cache lives behind its own `TASK_META_CACHE_REDIS_*` env-var prefix that falls back to the default `REDIS_*` set, so operators can route the cache to a dedicated Redis instance if they want. The service/instance file split (`taskMetadataCache.server.ts` for the pure class, `taskMetadataCacheInstance.server.ts` for the env-wired singleton) mirrors the existing `runsReplicationService` / `runsReplicationInstance` pattern.
## Test plan
- [ ] `pnpm run typecheck --filter webapp`
- [ ] `pnpm run test ./test/engine/triggerTask.test.ts --run`: 8 existing tests untouched + 5 new tests covering warm cache, cold miss with back-fill, queue + ttl path, by-worker vs env keyspace, and the promotion cache write
- [ ] End-to-end against a dev worker: registering writes both keyspaces with the expected TTLs, and `redis-cli HGETALL "tr:task-meta:env:<envId>"` returns the cached entries
## Benchmark
Measured `DefaultQueueManager.resolveQueueProperties` against a real Postgres + Redis (vitest `containerTest`, single-host docker). 500 sequential calls and 2,000 parallel calls (concurrency=50) per scenario, request shaped as `{ taskId, queue: "bench-queue", ttl: "5m" }`, the hot path this PR restores.
```
sequential (one in flight at a time):
  [noop cache (baseline)] n=500  mean=1.423ms  p50=1.394ms p95=1.735ms  p99=2.629ms  max=11.100ms
  [redis cache, cold    ] n=500  mean=1.346ms  p50=1.283ms p95=1.688ms  p99=2.463ms  max=5.058ms
  [redis cache, warm    ] n=500  mean=0.084ms  p50=0.078ms p95=0.105ms  p99=0.156ms  max=1.129ms
  speedup (warm vs baseline, sequential): 16.95x

parallel (concurrency=50):
  [noop cache (baseline)] n=2000 mean=10.069ms p50=8.850ms p95=14.718ms p99=31.887ms total=405ms ops/s=4,940
  [redis cache, warm    ] n=2000 mean=0.614ms  p50=0.568ms p95=1.189ms  p99=1.432ms  total=25ms  ops/s=80,389
  throughput speedup (warm vs baseline, parallel): 16.27x
```
Read:
- **Warm cache cuts resolver latency 17×** at p50, from ~1.4 ms to ~78 µs per call.
- **Cold cache is on par with baseline**: the extra `HGET` miss adds <50 µs against the two Postgres queries that follow, so the worst case is not worse than today.
- **Under burst load (50 concurrent triggers)**, the baseline's p99 jumps to ~32 ms as Postgres connections queue up; warm stays at ~1.4 ms. The cache moves the saturation point from ~5k ops/s (PG pool) to ~80k ops/s (single-client Redis pipelining).

Caveats: single-host docker, local Postgres + Redis, resolver-only measurement (excludes the rest of the trigger transaction). Prod adds region-local Redis RTT (~0.3–0.8 ms) which shifts warm absolute numbers up but keeps the ratio intact.
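The warm-path read with database fallback and single-field back-fill can be sketched as below. This is an illustration under assumptions, not the PR's `taskMetadataCache.server.ts`: `FakeHashStore` stands in for a Redis hash, `TaskMetaResolver` and `loadFromDb` are hypothetical names, keying the hash field by task slug is an assumption, and TTL handling and the by-worker key space are left out.

```typescript
type TaskMeta = {
  ttl: string | null;
  triggerSource: "STANDARD" | "SCHEDULED" | "AGENT";
  queueId: string;
  queueName: string;
};

// Stand-in for a Redis hash: one Map per key, one entry per hash field.
class FakeHashStore {
  private hashes = new Map<string, Map<string, string>>();
  async hget(key: string, field: string): Promise<string | null> {
    return this.hashes.get(key)?.get(field) ?? null;
  }
  async hset(key: string, field: string, value: string): Promise<void> {
    const hash = this.hashes.get(key) ?? new Map<string, string>();
    hash.set(field, value); // touches only this field, sibling fields are left alone
    this.hashes.set(key, hash);
  }
}

class TaskMetaResolver {
  dbQueries = 0; // counts fallbacks so the warm path is observable

  constructor(
    private cache: FakeHashStore,
    // Hypothetical loader standing in for the Postgres lookup.
    private loadFromDb: (envId: string, taskSlug: string) => Promise<TaskMeta | null>
  ) {}

  async resolve(envId: string, taskSlug: string): Promise<TaskMeta | null> {
    const key = `task-meta:env:${envId}`;
    const cached = await this.cache.hget(key, taskSlug);
    if (cached !== null) return JSON.parse(cached) as TaskMeta; // warm path: one HGET

    // Cold path: fall back to the database, then back-fill just this field.
    this.dbQueries++;
    const fresh = await this.loadFromDb(envId, taskSlug);
    if (fresh) await this.cache.hset(key, taskSlug, JSON.stringify(fresh));
    return fresh;
  }
}
```

Back-filling a single field rather than rewriting the whole hash is what lets two concurrent cold reads for different tasks coexist without erasing each other's entries.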
…3963)
## Summary
`chat.headStart` (the warm step-1 fast path) previously handed its response over only to `chat.agent`. This extends handover to the other two backends: `chat.customAgent` consumes it with `conversation.consumeHandover({ payload })` on turn 0, and `chat.createSession` surfaces it as `turn.handover` (call `turn.complete()` with no source to finalize a pure-text handover). The low-level `chat.waitForHandover()` and `accumulator.applyHandover()` are exported for hand-rolled loops.
It also adds `triggerConfig` to `chat.headStart()` and `chat.openSession()`, so the auto-triggered handover-prepare run inherits tags, queue, machine, and the other session run options the same way `chat.createStartSessionAction()` does. The `chat:{chatId}` tag is prepended automatically. Because the session is created once on the first head-start turn (idempotent on the chat id), this is the only place those options can be set for a head-start chat's lifetime.
## Fix: tool-call resume
When the warm step-1 hands over a pending tool call (rather than pure text), the agent loop resumes that tool round. For it to merge cleanly the pipe threads the spliced partial as `originalMessages`, so the resumed tool-output chunk attaches to the handed-over tool-call instead of throwing `No tool invocation found`. `MessageAccumulator.addResponse` now also dedups by id (replace-in-place), so the persisted history doesn't carry a duplicate assistant message when the resumed response reuses the partial's id.
Incorporates the `triggerConfig` work from [#3933](#3933) by @saasjesus, with `createStartSessionAction` extended to also forward `maxDuration`, `region`, and `lockToVersion` so the two session entry points stay consistent.
Verified end-to-end against a local environment: handover (pure-text and tool-call) on both new backends, a `chat.agent` regression pass, and `triggerConfig` tags and queue landing on the run.

---------
Co-authored-by: saasjesus <armin@chatarmin.com>
## Summary
Reworks the scheduled task page right-hand sidebar.
- Adds **Overview** / **Schedules** tabs. The Schedules tab is a paginated table of all schedules attached to the task, declarative first.
- Surfaces schedule fields (ID, CRON + human-readable description, next/last run, status) directly in the Overview property table.
- Sidebar can be dragged much wider (up to 80% of the viewport).
- "No schedules attached" panel explains declarative vs imperative and links to docs.
- Schedule **create / edit / enable / disable / delete** all happen inside the existing Sheet; no more navigating to the standalone schedule page. Toasts confirm each action.
## Test plan
- Open a scheduled task page and verify the new tabs
- Create, edit, enable/disable, and delete a schedule; confirm you stay on the page and see a toast each time
- Visit a task with no schedules attached and confirm the info panel renders
- Drag the sidebar wider; confirm pagination shows when there are >25 schedules
## Summary
Docs deploy from the `docs-live` branch via Mintlify, so merging to `main` no longer publishes docs on its own. To publish, push a `docs-release-*` tag at the commit you want live. The workflow runs the Mintlify broken-links check against that commit, then fast-forwards `docs-live` to it, which is what Mintlify deploys from.
## Design
The ref move uses the GitHub API with `force=false`, making it fast-forward only: a tag that is not ahead of `docs-live` fails the job rather than rewinding production. Mintlify's GitHub app reacts to the resulting push and deploys, so no extra deploy credentials are needed.
Usage:
```bash
git tag docs-release-2026.06.16   # tag the main commit you want live
git push origin docs-release-2026.06.16
```
…3964)
## Summary
`chat.headStart` now works with the `chat.customAgent` and `chat.createSession` backends (not just `chat.agent`), and takes a `triggerConfig` option. These docs cover both.
The Fast starts guide gets a "Handover with custom agents" section showing how each backend consumes the handover (`consumeHandover` returning `{ isFinal, skipped }` for custom agents, `turn.handover` for createSession), including threading `originalMessages` so a resumed tool round merges into the handed-over assistant. The `chat.headStart` API section documents `triggerConfig` (tags, queue, machine, and the rest) on the auto-triggered run. The reference picks up `ChatTurn.handover`, `turn.complete()` with no source, `chat.waitForHandover`, and a new `HeadStartHandlerOptions` table.
Docs for the SDK changes in [#3963](#3963).
…served keys (#3966) Fix Vercel onboarding wizard to properly filter out reserved TRIGGER_ env vars
## Summary
New `/ai-chat/prompt-caching` guide covering how to cache a chat agent's prompt prefix with Anthropic prompt caching: the system prompt, the conversation history (a `prepareMessages` breakpoint), and how caching interacts with compaction. It also shows how to verify cache hits via usage and the dashboard, the prefix-stability footguns, and an "Other providers" section (OpenAI and Google cache automatically; Amazon Bedrock uses `cachePoint` through `systemProviderOptions`).
Registered under Features in the AI Agents nav, next to Compaction.

---------
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Eric Allam <ericallam@users.noreply.github.com>
## Summary
The "What extractNewToolResults returns" reference in the tool-result-auditing guide did not match the SDK. It listed an `input` field that `chat.history.extractNewToolResults()` never returns, and marked `output` as optional when it is always present.
This corrects the block to the real `ChatNewToolResult` shape (`toolCallId`, `toolName`, `output`, optional `errorText`). Every usage example in the same guide already reads only those fields, so the reference now matches both the examples and the code.
…3958)
## Summary
The Models page is now split into two tabs. **Your models** shows the models your project has actually used in the selected time range, with usage charts (cost over time, tokens over time, calls by model), a per-model table of calls / cost / avg TTFC / avg tokens-per-sec, and calls/tokens trend sparklines. **Model library** is the full catalog, reordered from alphabetical to a relevance-based provider order (Anthropic, OpenAI, Google, then the rest), newest models first within each provider, with a "New" badge on models released in the last 7 days.
One time-range selector drives the whole Your models tab, so the charts, the table, and the sparklines all share the same window. Opening a model shows its own metrics with an independent range picker and a "View in AI metrics" link that opens the AI metrics dashboard filtered to that model. The active tab is kept in the URL so it survives a refresh and is shareable.
## Prompt caching & cost accuracy
Both the Your models tab and the AI metrics dashboard now surface prompt-cache usage: a cache-savings column plus per-model cached-tokens and cache-hit-rate views, and a caching section on the dashboard (hit rate, cached tokens, estimated savings, and hit rate by model).
Building this surfaced a cost bug. `input_tokens` is the total prompt count and already includes cache-read and cache-creation tokens, but the cost pipeline charged the full input at the input price and then added a separate cache line, so cached tokens were billed twice (and on Anthropic, cache reads were never discounted because their price is keyed differently). The input price now applies only to the non-cached remainder, with cache prices resolved across the provider-specific keys, so LLM cost and the cache hit-rate metric are accurate. Hit rate is computed as cached reads over total input.
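The corrected arithmetic can be illustrated with a small worked example. This is a sketch of the rule as described above, not the actual cost pipeline: the `Usage` and `Prices` shapes, the function names, and the per-token prices are all made up for illustration.

```typescript
// Hypothetical shapes; `inputTokens` is the total prompt count, cache tokens included.
type Usage = {
  inputTokens: number;
  cacheReadTokens: number;
  cacheCreationTokens: number;
  outputTokens: number;
};

// Illustrative per-token prices in arbitrary integer units.
type Prices = { input: number; cacheRead: number; cacheCreation: number; output: number };

function computeCost(usage: Usage, prices: Prices): number {
  // Charge the input price only on the non-cached remainder of the prompt.
  const nonCachedInput = Math.max(
    0,
    usage.inputTokens - usage.cacheReadTokens - usage.cacheCreationTokens
  );
  return (
    nonCachedInput * prices.input +
    usage.cacheReadTokens * prices.cacheRead +
    usage.cacheCreationTokens * prices.cacheCreation +
    usage.outputTokens * prices.output
  );
}

// The old behaviour described above: full input price plus a separate cache line.
function computeCostDoubleBilled(usage: Usage, prices: Prices): number {
  return (
    usage.inputTokens * prices.input +
    usage.cacheReadTokens * prices.cacheRead +
    usage.cacheCreationTokens * prices.cacheCreation +
    usage.outputTokens * prices.output
  );
}

// Hit rate: cached reads over total input.
function cacheHitRate(usage: Usage): number {
  return usage.inputTokens === 0 ? 0 : usage.cacheReadTokens / usage.inputTokens;
}

const usage: Usage = { inputTokens: 1000, cacheReadTokens: 600, cacheCreationTokens: 100, outputTokens: 200 };
const prices: Prices = { input: 10, cacheRead: 1, cacheCreation: 12, output: 50 };

console.log(computeCost(usage, prices)); // 300*10 + 600*1 + 100*12 + 200*50 = 14800
console.log(computeCostDoubleBilled(usage, prices)); // 1000*10 + 600 + 1200 + 10000 = 21800
console.log(cacheHitRate(usage)); // 0.6
```

With these made-up numbers the double-billed total overstates cost by exactly the 700 cached tokens charged a second time at the input price.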
## Notes
Also fixes React "invalid DOM property" console warnings from the provider icons (the Llama and DeepSeek SVGs used raw `fill-rule` / `clip-rule` / `clip-path` attributes), which this page surfaces by rendering more provider icons.
## Screenshots
**Your models tab:** usage charts and a per-model table with calls/tokens trend sparklines.
<img width="2560" height="1267" alt="1-your-models-tab" src="https://github.com/user-attachments/assets/859bd24f-9047-4828-8bbb-83e5882846d6" />
**Model library:** provider-relevance ordering with a "New" badge on models released in the last 7 days.
<img width="2560" height="1267" alt="2-model-library-tab" src="https://github.com/user-attachments/assets/46dd54b9-80f9-4922-ade9-5935b08dfebc" />
**Model detail, Metrics tab:** per-model range picker and a "View in AI metrics" link.
<img width="2560" height="1267" alt="3-model-detail-metrics" src="https://github.com/user-attachments/assets/0f65d9d0-6142-4918-93f0-110bb277101a" />
**View in AI metrics:** the dashboard deep-linked and filtered to the selected model.
<img width="2560" height="1267" alt="4-ai-metrics-filtered" src="https://github.com/user-attachments/assets/821f256c-e305-493c-98c7-eafaf2f57f83" />
…#3939)
## Summary
The agent skills' deep guidance now ships inside `@trigger.dev/sdk` and is read from `node_modules`, so it tracks the `@trigger.dev/sdk` version installed in your project automatically. This updates the Skills page, the Building with AI step, and the rules-redirect page to drop the old "pinned to the CLI version, re-run to refresh" framing and describe the version-pinned reference instead.
Pairs with the SDK/CLI change in #3937. Keep this draft until that ships, since it describes behavior that is not released yet.
## Summary
Typing in the search bar on the task page could clear or reset the input mid-keystroke. This fixes the re-render race so the field stays stable while you type.
## Root cause
Two things compounded:
- `SearchInput`'s sync effect depended on `text`, so it re-ran on every keystroke and could overwrite the input with the URL/controlled value while focused.
- Each task row unmounted and remounted its activity chart during the side-panel open/close animation (25 charts at once), forcing heavy re-renders that the search effect raced against.
## Fix
- `SearchInput` now tracks the last synced value in a ref instead of comparing against `text`, keeping the effect off the keystroke path. It only writes to state when the incoming URL/controlled value actually changes, and never while the input is focused.
- Activity charts are now hidden (`hidden` attribute) instead of unmounted during the panel animation, so the rows don't churn the tree and the resize stays smooth.

---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ngs (#3970) ## Summary Three improvements to the SDK-bundled agent skills (follow-up to the skills installer): - **`trigger-` namespace.** The installed skills (`authoring-tasks`, `getting-started`, …) had generic names that collide with unrelated skills in a shared agent skills directory. They're now prefixed — `trigger-authoring-tasks`, `trigger-getting-started`, etc. — matching the convention the public skills repo already uses. - **New `trigger-cost-savings` skill.** An MCP-driven cost audit: right-sizes machines, flags missing `maxDuration`, spots sequential triggers that could batch, and reviews schedule frequency, using `list_runs` / `get_run_details` for live analysis. - **Bundle the full docs.** `@trigger.dev/sdk` now bundles the entire "Documentation" section of the docs (157 pages) instead of a curated 55-page subset, so an agent has the complete, version-pinned reference in `node_modules`. ## How the bundling works `scripts/bundleSdkDocs.ts` now reads `docs/docs.json`, walks the "Documentation" dropdown, and copies every page under it into the SDK. The set tracks the docs navigation automatically — add a page to the nav and it ships, no skill edits needed. The API reference and Guides & examples dropdowns are intentionally excluded. A skill's `sources:` frontmatter is now informational only. The dropped idea of a dedicated `trigger-config` skill is replaced by references to the bundled build-extension docs (`config/extensions/*`) from the `trigger-authoring-tasks` config section and the chat-agent skills.
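The navigation walk described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of `scripts/bundleSdkDocs.ts`: the `docs.json` shape (a `navigation.dropdowns` array whose entries hold `groups` with possibly nested `pages`) is assumed, and `collectPages`, `pagesForDropdown`, and the example page slugs are hypothetical.

```typescript
// Assumed shape of the docs navigation; the real docs.json may differ.
type NavEntry = string | { group: string; pages: NavEntry[] };

interface Dropdown {
  dropdown: string;
  groups: { group: string; pages: NavEntry[] }[];
}

interface DocsConfig {
  navigation: { dropdowns: Dropdown[] };
}

// Recursively flatten page slugs, descending into nested groups.
function collectPages(entries: NavEntry[]): string[] {
  const pages: string[] = [];
  for (const entry of entries) {
    if (typeof entry === "string") {
      pages.push(entry);
    } else {
      for (const nested of collectPages(entry.pages)) pages.push(nested);
    }
  }
  return pages;
}

// Return every page under one named dropdown and ignore all others.
function pagesForDropdown(config: DocsConfig, name: string): string[] {
  const pages: string[] = [];
  for (const dropdown of config.navigation.dropdowns) {
    if (dropdown.dropdown !== name) continue;
    for (const group of dropdown.groups) {
      for (const page of collectPages(group.pages)) pages.push(page);
    }
  }
  return pages;
}

// Hypothetical navigation: only the "Documentation" dropdown is bundled.
const exampleConfig: DocsConfig = {
  navigation: {
    dropdowns: [
      {
        dropdown: "Documentation",
        groups: [
          { group: "Getting started", pages: ["introduction"] },
          {
            group: "Writing tasks",
            pages: [
              "tasks/overview",
              { group: "Configuration", pages: ["config/extensions/overview"] },
            ],
          },
        ],
      },
      {
        dropdown: "API reference",
        groups: [{ group: "Runs", pages: ["management/runs/list"] }],
      },
    ],
  },
};

const bundledPages = pagesForDropdown(exampleConfig, "Documentation");
```

Because the page set is derived from the navigation rather than from a hand-maintained list, adding a page to the dropdown is enough for it to be picked up.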
Adds an opt-in mechanism to route a configurable percentage of organizations onto the compute (MicroVM) backing of their region at trigger time, without changing their stored region settings. Routing is gated by three global feature flags - `computeMigrationEnabled`, `computeMigrationFreePercentage`, `computeMigrationPaidPercentage` - plus a per-org `computeMigrationEnabled` override that wins in both directions. A region's compute backing is resolved from a new `WorkerInstanceGroup.region` column: a container group and its MicroVM group share one geo `region`, so the migration swaps the resolved worker queue to the backing group's queue. Orgs are bucketed deterministically by id, so ramping a percentage down keeps a strict subset rather than reshuffling, and a region with no compute backing is never touched. Everything is off by default - behaviour is unchanged unless the flags are set. The flags and the worker-region groups are read on the trigger hot path from in-memory snapshots rather than the database: a small `createReloadingRegistry` helper loads each at startup and refreshes them on an interval, so no per-trigger query is added and a percentage or kill-switch change propagates within the reload interval. A cold replica whose snapshot hasn't loaded yet reads as not-migrated (the container path) and self-corrects on the next load - the same cold-start contract as the datastore / LLM-pricing registries, with a `reloading_registry_loaded` metric so a never-loaded registry is alertable. The same migration decision is consulted at deploy-time template creation so a migrated org gets a compute template built ahead of its first run. This runs in shadow mode (best-effort, never fails the deploy) by default, or - when the `computeMigrationRequireTemplate` flag is on - in required mode, built synchronously at deploy so the first run never builds on-demand and template errors surface at deploy time. 
So operators keep "which runs ran where" while customers only see geography: the run's actual worker queue is stored raw, and the geo region is stamped separately on `TaskRun.region` (and a new ClickHouse `region` column) at trigger time. Read surfaces - the dashboard, the API, and the Query/Logs page - show the geo region, falling back to the worker queue for runs written before the column existed. Minor follow-ups left out of scope: the percentage flags render as text inputs on the admin flags page (the catalog UI has no numeric control type yet), and `createReloadingRegistry` could later gain pub/sub for sub-second cross-replica propagation if the reload interval proves too slow.
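The deterministic bucketing described above can be illustrated with a minimal sketch. It assumes a 32-bit FNV-1a hash of the org id reduced to 100 buckets; the actual hash, the bucket count, and the names `bucketFor` and `isMigrated` are assumptions, and the global `computeMigrationEnabled` flag and the free/paid split are left out for brevity.

```typescript
// Map an org id to a stable bucket in [0, 100) using 32-bit FNV-1a.
// The hash choice is an assumption; any stable hash gives the same property.
function bucketFor(orgId: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < orgId.length; i++) {
    hash ^= orgId.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return (hash >>> 0) % 100;
}

interface MigrationInput {
  orgId: string;
  percentage: number; // 0..100, e.g. the free or paid percentage flag
  orgOverride?: boolean; // per-org override, wins in both directions
  hasComputeBacking: boolean; // false when the region has no MicroVM group
}

// Hypothetical decision function; the order of checks is an assumption.
function isMigrated(input: MigrationInput): boolean {
  if (!input.hasComputeBacking) return false; // never touch such regions
  if (input.orgOverride !== undefined) return input.orgOverride;
  return bucketFor(input.orgId) < input.percentage;
}
```

Since an org's bucket never changes, every org migrated at 10% has a bucket below 10 and is therefore also migrated at 50%. Lowering the percentage only removes orgs from the top of the range, which is what keeps a ramp-down a strict subset rather than a reshuffle.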
## Summary 7 improvements. ## Improvements - `@trigger.dev/sdk` now bundles the Trigger.dev agent skills and a curated snapshot of the docs those skills reference. The skills that `trigger skills` installs into your coding agent read this content from node_modules, so the guidance your AI assistant follows is pinned to the SDK version installed in your project and stays current across upgrades instead of going stale until the next reinstall. ([#3937](#3937)) - Running a CLI command like `dev`, `deploy`, `preview`, or `update` before initializing a project no longer crashes with a raw `Cannot find matching package.json` stack trace. The CLI now detects the missing project and points you to `npx trigger.dev@latest init` instead. ([#3929](#3929)) - The agent skills installed by `trigger skills` are now namespaced with a `trigger-` prefix (e.g. `trigger-authoring-tasks`, `trigger-getting-started`) so they don't collide with unrelated skills in your coding agent's skills directory. Adds a `trigger-cost-savings` skill for auditing and reducing compute spend (right-sizing machines, `maxDuration`, batching, debounce), and `@trigger.dev/sdk` now bundles the full Trigger.dev documentation so your agent can read the complete, version-pinned reference directly from node_modules. ([#3970](#3970)) - The run span API response now includes `cachedCost` and `cacheCreationCost` on the `ai` object, alongside the existing `inputCost` / `outputCost` / `totalCost`. `inputCost` reflects only the non-cached input, so these fields let you reconstruct the full cost breakdown for prompt-cached calls. ([#3958](#3958)) - `chat.headStart` now works with the `chat.customAgent` and `chat.createSession` backends, not only `chat.agent`. The warm step-1 response hands over to your loop the same way it does for a managed agent. 
([#3963](#3963)) In a `chat.customAgent` loop, consume the handover on turn 0: ```ts const conversation = new chat.MessageAccumulator(); const { isFinal, skipped } = await conversation.consumeHandover({ payload }); if (skipped) return; // warm handler aborted, so exit without a turn if (isFinal) { await chat.writeTurnComplete(); // step 1 is the response, no streamText } else { const result = streamText({ model, messages: conversation.modelMessages, tools }); // Pass originalMessages so the handed-over tool round merges into the // step-1 assistant instead of starting a new message. const response = await chat.pipeAndCapture(result, { originalMessages: conversation.uiMessages, }); if (response) await conversation.addResponse(response); } ``` With `chat.createSession`, the iterator surfaces it as `turn.handover`; call `turn.complete()` with no argument on a final handover. The lower-level `chat.waitForHandover()` and `accumulator.applyHandover()` are also exported for hand-rolled loops. - Cache your chat agent's system prompt with Anthropic prompt caching. `chat.toStreamTextOptions()` now emits the system prompt as a cacheable message when you opt in, so a large, stable system block is billed at cache-read rates on every turn instead of full price. ([#3952](#3952)) ```ts // at the streamText call site (Anthropic sugar) streamText({ ...chat.toStreamTextOptions({ cacheControl: { type: "ephemeral" } }), messages, }); // provider-agnostic equivalent chat.toStreamTextOptions({ systemProviderOptions: { anthropic: { cacheControl: { type: "ephemeral" } } }, }); // or where the prompt is defined chat.prompt.set(SYSTEM_PROMPT, { providerOptions: { anthropic: { cacheControl: { type: "ephemeral" } } }, }); ``` Without an option, `system` stays a plain string. Pairs with a `prepareMessages` cache breakpoint to cache the conversation prefix across turns too. 
- Three fixes for custom agent loops (`chat.customAgent`, `chat.createSession`, and hand-rolled `MessageAccumulator` loops): ([#3936](#3936)) - Continuation runs no longer replay already-answered user messages into the first turn. The `.in` resume cursor is now seeded before any listener attaches (the same boot logic `chat.agent` uses), so a chat that continues after a cancel, crash, or upgrade only sees genuinely new messages. - Steering a hand-rolled loop mid-stream no longer wipes the in-flight assistant response. `chat.pipeAndCapture` now stamps a server-generated message id on the stream, so a `prepareStep` injection keeps the partial text instead of replacing the message. - Task-backed tools (`ai.toolExecute`) now work from custom agent loops: the parent's session is threaded to the child run, so child tasks can stream progress into the chat with `chat.stream.writer({ target: "root" })` instead of failing with "session handle is not initialized". <details> <summary>Raw changeset output</summary>⚠️ ⚠️ ⚠️ ⚠️ ⚠️ ⚠️ `main` is currently in **pre mode** so this branch has prereleases rather than normal releases. If you want to exit prereleases, run `changeset pre exit` on `main`.⚠️ ⚠️ ⚠️ ⚠️ ⚠️ ⚠️ # Releases ## @trigger.dev/build@4.5.0-rc.7 ### Patch Changes - Updated dependencies: - `@trigger.dev/core@4.5.0-rc.7` ## trigger.dev@4.5.0-rc.7 ### Patch Changes - `@trigger.dev/sdk` now bundles the Trigger.dev agent skills and a curated snapshot of the docs those skills reference. The skills that `trigger skills` installs into your coding agent read this content from node_modules, so the guidance your AI assistant follows is pinned to the SDK version installed in your project and stays current across upgrades instead of going stale until the next reinstall. ([#3937](#3937)) - Running a CLI command like `dev`, `deploy`, `preview`, or `update` before initializing a project no longer crashes with a raw `Cannot find matching package.json` stack trace. 
The CLI now detects the missing project and points you to `npx trigger.dev@latest init` instead. ([#3929](#3929)) - The agent skills installed by `trigger skills` are now namespaced with a `trigger-` prefix (e.g. `trigger-authoring-tasks`, `trigger-getting-started`) so they don't collide with unrelated skills in your coding agent's skills directory. Adds a `trigger-cost-savings` skill for auditing and reducing compute spend (right-sizing machines, `maxDuration`, batching, debounce), and `@trigger.dev/sdk` now bundles the full Trigger.dev documentation so your agent can read the complete, version-pinned reference directly from node_modules. ([#3970](#3970)) - Updated dependencies: - `@trigger.dev/core@4.5.0-rc.7` - `@trigger.dev/build@4.5.0-rc.7` - `@trigger.dev/schema-to-json@4.5.0-rc.7` ## @trigger.dev/core@4.5.0-rc.7 ### Patch Changes - The run span API response now includes `cachedCost` and `cacheCreationCost` on the `ai` object, alongside the existing `inputCost` / `outputCost` / `totalCost`. `inputCost` reflects only the non-cached input, so these fields let you reconstruct the full cost breakdown for prompt-cached calls. ([#3958](#3958)) ## @trigger.dev/python@4.5.0-rc.7 ### Patch Changes - Updated dependencies: - `@trigger.dev/sdk@4.5.0-rc.7` - `@trigger.dev/core@4.5.0-rc.7` - `@trigger.dev/build@4.5.0-rc.7` ## @trigger.dev/react-hooks@4.5.0-rc.7 ### Patch Changes - Updated dependencies: - `@trigger.dev/core@4.5.0-rc.7` ## @trigger.dev/redis-worker@4.5.0-rc.7 ### Patch Changes - Updated dependencies: - `@trigger.dev/core@4.5.0-rc.7` ## @trigger.dev/rsc@4.5.0-rc.7 ### Patch Changes - Updated dependencies: - `@trigger.dev/core@4.5.0-rc.7` ## @trigger.dev/schema-to-json@4.5.0-rc.7 ### Patch Changes - Updated dependencies: - `@trigger.dev/core@4.5.0-rc.7` ## @trigger.dev/sdk@4.5.0-rc.7 ### Patch Changes - `@trigger.dev/sdk` now bundles the Trigger.dev agent skills and a curated snapshot of the docs those skills reference. 
The skills that `trigger skills` installs into your coding agent read this content from node_modules, so the guidance your AI assistant follows is pinned to the SDK version installed in your project and stays current across upgrades instead of going stale until the next reinstall. ([#3937](#3937)) - `chat.headStart` now works with the `chat.customAgent` and `chat.createSession` backends, not only `chat.agent`. The warm step-1 response hands over to your loop the same way it does for a managed agent. ([#3963](#3963)) In a `chat.customAgent` loop, consume the handover on turn 0: ```ts const conversation = new chat.MessageAccumulator(); const { isFinal, skipped } = await conversation.consumeHandover({ payload }); if (skipped) return; // warm handler aborted, so exit without a turn if (isFinal) { await chat.writeTurnComplete(); // step 1 is the response, no streamText } else { const result = streamText({ model, messages: conversation.modelMessages, tools }); // Pass originalMessages so the handed-over tool round merges into the // step-1 assistant instead of starting a new message. const response = await chat.pipeAndCapture(result, { originalMessages: conversation.uiMessages, }); if (response) await conversation.addResponse(response); } ``` With `chat.createSession`, the iterator surfaces it as `turn.handover`; call `turn.complete()` with no argument on a final handover. The lower-level `chat.waitForHandover()` and `accumulator.applyHandover()` are also exported for hand-rolled loops. - Add `triggerConfig` support to `chat.headStart()` and `chat.openSession()`, so the auto-triggered handover-prepare run inherits tags, queue, machine, and other session trigger options the same way `chat.createStartSessionAction()` does. The `chat:{chatId}` tag is prepended automatically. 
([#3963](#3963)) ```ts export const POST = chat.headStart({ agentId: "my-agent", triggerConfig: { tags: ["org:acme"], queue: "chat" }, run: async ({ chat }) => streamText({ ...chat.toStreamTextOptions(), model }), }); ``` Because the session is created once on the first head-start turn and is idempotent on the chat id, this is the only place to set those options for a head-start chat's lifetime. `chat.createStartSessionAction()` now also forwards `maxDuration`, `region`, and `lockToVersion` so both session entry points stay consistent. - Cache your chat agent's system prompt with Anthropic prompt caching. `chat.toStreamTextOptions()` now emits the system prompt as a cacheable message when you opt in, so a large, stable system block is billed at cache-read rates on every turn instead of full price. ([#3952](#3952)) ```ts // at the streamText call site (Anthropic sugar) streamText({ ...chat.toStreamTextOptions({ cacheControl: { type: "ephemeral" } }), messages, }); // provider-agnostic equivalent chat.toStreamTextOptions({ systemProviderOptions: { anthropic: { cacheControl: { type: "ephemeral" } } }, }); // or where the prompt is defined chat.prompt.set(SYSTEM_PROMPT, { providerOptions: { anthropic: { cacheControl: { type: "ephemeral" } } }, }); ``` Without an option, `system` stays a plain string. Pairs with a `prepareMessages` cache breakpoint to cache the conversation prefix across turns too. - Three fixes for custom agent loops (`chat.customAgent`, `chat.createSession`, and hand-rolled `MessageAccumulator` loops): ([#3936](#3936)) - Continuation runs no longer replay already-answered user messages into the first turn. The `.in` resume cursor is now seeded before any listener attaches (the same boot logic `chat.agent` uses), so a chat that continues after a cancel, crash, or upgrade only sees genuinely new messages. - Steering a hand-rolled loop mid-stream no longer wipes the in-flight assistant response. 
`chat.pipeAndCapture` now stamps a server-generated message id on the stream, so a `prepareStep` injection keeps the partial text instead of replacing the message. - Task-backed tools (`ai.toolExecute`) now work from custom agent loops: the parent's session is threaded to the child run, so child tasks can stream progress into the chat with `chat.stream.writer({ target: "root" })` instead of failing with "session handle is not initialized". - The agent skills installed by `trigger skills` are now namespaced with a `trigger-` prefix (e.g. `trigger-authoring-tasks`, `trigger-getting-started`) so they don't collide with unrelated skills in your coding agent's skills directory. Adds a `trigger-cost-savings` skill for auditing and reducing compute spend (right-sizing machines, `maxDuration`, batching, debounce), and `@trigger.dev/sdk` now bundles the full Trigger.dev documentation so your agent can read the complete, version-pinned reference directly from node_modules. ([#3970](#3970)) - Updated dependencies: - `@trigger.dev/core@4.5.0-rc.7` ## @trigger.dev/plugins@4.5.0-rc.7 ### Patch Changes - Updated dependencies: - `@trigger.dev/core@4.5.0-rc.7` </details> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Replicates `TaskRun.planType` into the `task_runs_v2` ClickHouse table so run analytics can group by plan type. Adds a `plan_type` column (goose migration `033`, `LowCardinality(String)`), the replication insert mapping, and the matching schema/column/type entries - same shape as the recent `region` addition. Write-once at trigger, so it just rides along on existing replicated rows. Internal analytics only; not exposed in the Query API.
#3960) ## Summary Prisma infrastructure failures (P1xxx-class: database unreachable, timed out, connection dropped, engine init/panic) carry the database hostname in their `.message`. This captures them centrally for observability and ensures they never reach API clients verbatim. ## Design A `$allOperations` client extension on the writer and replica clients logs infrastructure errors with the originating model and operation, then rethrows the **original** error unchanged — call sites that branch on `error.code` (unique-violation idempotency, not-found handling) and transaction retries keep working. Only infrastructure errors are logged; routine query/validation errors (P2xxx) are left alone. `$allOperations` can't see the transaction boundary (`$transaction` is a client method, not an operation), so infrastructure errors surfacing from `$transaction()` without a Prisma code — e.g. `PrismaClientInitializationError` — are logged separately at the transaction wrapper, where the existing coded-error path would otherwise miss them. `clientSafeErrorMessage()` swaps an infrastructure error's message for `"Internal Server Error"` at the API routes that previously returned `error.message` raw. Status codes, headers, and every non-infrastructure message are unchanged. ## Test plan - [x] P2002 / P2025 rethrow with code intact and are not logged - [x] Statement errors inside `$transaction` keep their code (retry logic intact) - [x] Raw queries wrapped without crashing on the undefined model - [x] A genuine connectivity failure is logged with model/operation/code - [x] `clientSafeErrorMessage` obfuscates infra messages, preserves all others - [x] `pnpm run typecheck --filter webapp` (12/12) ## Note Overlaps with #3391 (Prisma 7 migration) on `apps/webapp/app/db.server.ts` — coordinate rebasing.
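As a rough illustration of the message-swapping idea, the sketch below classifies an error as infrastructure-level and returns a generic message for it. It is not the PR's implementation: the detection rule (a `code` matching `P1xxx`, or an error `name` of `PrismaClientInitializationError` / `PrismaClientRustPanicError`) is an assumption, and `isInfrastructureError`, `clientSafeErrorMessageSketch`, `codedError`, and the example hostname are hypothetical.

```typescript
const GENERIC_MESSAGE = "Internal Server Error";

// Assumed rule: P1xxx codes and init/panic errors count as infrastructure.
function isInfrastructureError(error: unknown): boolean {
  if (typeof error !== "object" || error === null) return false;
  const candidate = error as { code?: unknown; name?: unknown };
  if (typeof candidate.code === "string" && /^P1\d{3}$/.test(candidate.code)) {
    return true;
  }
  return (
    candidate.name === "PrismaClientInitializationError" ||
    candidate.name === "PrismaClientRustPanicError"
  );
}

// Swap only infrastructure messages; everything else passes through unchanged.
function clientSafeErrorMessageSketch(error: unknown): string {
  if (isInfrastructureError(error)) return GENERIC_MESSAGE;
  return error instanceof Error ? error.message : String(error);
}

// Helper for the example: an Error carrying a Prisma-style code.
function codedError(message: string, code: string): Error & { code: string } {
  const err = new Error(message) as Error & { code: string };
  err.code = code;
  return err;
}

const unreachable = codedError(
  "Can't reach database server at db.internal.example:5432",
  "P1001"
);
const uniqueViolation = codedError("Unique constraint failed on the fields: (email)", "P2002");

const safeInfraMessage = clientSafeErrorMessageSketch(unreachable);
const safeQueryMessage = clientSafeErrorMessageSketch(uniqueViolation);
```

The helper only derives a string for the response body. The error object itself is left untouched, so call sites that branch on `error.code` keep working.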
The global feature flags admin page had a few rough edges. The percentage flags are numeric (`z.coerce.number()`) but rendered as free-text inputs, so you could type non-numeric values that only failed validation after submitting - and the error surfaced behind the confirm dialog. The control-type detection now recognises numbers and renders a proper number input, with the min/max range as the placeholder so the type is clear even when the field is unset. The save error also shows inside the confirm dialog now, not just behind it. The action buttons were unreachable without zooming out. The admin layout wrapped each page in a plain block, so `h-full` page content overran the viewport by the height of the tab bar and got clipped by the `overflow-hidden` body. Making the layout a flex column bounds each page to the space below the tabs, so the existing per-page scroll works and the feature flags page scrolls like the Users/Orgs tabs. Also capped the confirm dialog's diff list so its footer stays on screen when there are many changes.
## Summary
Adds a "Duration and cost while paused" section to the human-in-the-loop
page. It explains that a HITL pause (a no-execute tool waiting on
`addToolOutput`) suspends the run and frees compute, so the human's
thinking time does not count against `maxDuration` (which measures
active CPU time and excludes suspended waitpoint time, the same as
`wait.for`). Customers don't need to raise `maxDuration` or end the run
to support long human waits.
This was a recurring point of confusion: readers assumed the pause holds
the run open and burns the budget. Also updates the how-it-works
pseudocode ("Agent suspends (compute freed)") and links `wait.for` and
`maxDuration` on first mention.
## Summary Adds a "Stopping generation" section to the Custom agents page. It documents how stop works when you drop down from `chat.agent` to `chat.createSession`: pass `turn.signal` (a combined stop-and-cancel `AbortSignal`) to `streamText`, and `turn.complete()` cleans up the aborted partial, accumulates it as its own assistant message, and keeps the run alive for the next turn. `turn.stopped` distinguishes a user stop from a full run cancel. Until now the createSession stop story only existed as scattered fields in the reference table; the client side (`transport.stopGeneration`) and the `chat.agent` run-callback signals were documented, but not the custom-agent turn loop. Steering for these backends is already covered on the pending messages page, which this page links to.
…lling routes (#3948) ## Summary Several dashboard routes performed actions a restricted role should not be able to do (cancel or replay runs, manage prompt versions, invite and manage members, manage billing) without any permission check. This adds role-based permission enforcement to those routes, and disables the matching UI controls (with a tooltip) when the current role lacks permission. Covered actions: - Runs: cancel and replay (single, bulk create, bulk abort) - Prompts: create or edit override versions, and promote a version to current - Members: invite, resend invite, revoke invite - Billing: change plan, billing alerts, and the customer portal ## How Each affected route now goes through the `dashboardLoader` / `dashboardAction` route builders with an `authorization` block declaring the required permission (or a per-intent check where one route handles several intents). Existing tenancy and data-scoping queries are untouched; this only layers permission checks on top. The UI follows disable-don't-hide: controls stay visible but disabled with a "You don't have permission to ..." tooltip. Two reusable pieces support this: `checkPermissions(ability, checks)` turns a set of checks into a boolean map a loader returns to the client, and `PermissionButton` / `PermissionLink` disable the underlying control and show a tooltip when a permission flag is false. ## Behaviour No change in the default configuration: permissions are permissive, so every control stays enabled and every route behaves as before. The checks only take effect when an RBAC plugin is installed. This also makes role assignment on invite-accept non-fatal, so a failure there cannot block joining an org. Verified with `pnpm run typecheck --filter webapp`; `checkPermissions` has unit tests.
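To make the "boolean map" idea concrete, here is a minimal sketch of how a helper like `checkPermissions(ability, checks)` might turn named checks into flags a loader can return. The `Ability` interface, the `{ action, resource }` check shape, and the `checkPermissionsSketch` name are assumptions for illustration; the real signatures in the webapp may differ.

```typescript
// Hypothetical ability interface; the real RBAC plugin API may differ.
interface Ability {
  can(action: string, resource: string): boolean;
}

interface PermissionCheck {
  action: string;
  resource: string;
}

// Evaluate each named check and return a map of name -> allowed.
function checkPermissionsSketch<K extends string>(
  ability: Ability,
  checks: Record<K, PermissionCheck>
): Record<K, boolean> {
  const result = {} as Record<K, boolean>;
  for (const key of Object.keys(checks) as K[]) {
    const check = checks[key];
    result[key] = ability.can(check.action, check.resource);
  }
  return result;
}

// Default configuration: permissive, so every control stays enabled.
const permissiveAbility: Ability = { can: () => true };

// A restricted role, as an installed RBAC plugin might provide (assumed).
const readOnlyAbility: Ability = { can: (action) => action === "read" };

const runChecks = {
  canCancelRun: { action: "cancel", resource: "run" },
  canReplayRun: { action: "replay", resource: "run" },
  canViewRun: { action: "read", resource: "run" },
};

const defaultFlags = checkPermissionsSketch(permissiveAbility, runChecks);
const restrictedFlags = checkPermissionsSketch(readOnlyAbility, runChecks);
```

A component in the style of `PermissionButton` could then take a flag such as `restrictedFlags.canCancelRun` and render the control disabled with a tooltip instead of hiding it.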
A batch of technical-SEO fixes across the docs, all reader-facing (titles, links, redirects): - Canonicalize the duplicate CLI command pages: the bare `/cli-dev` and `/cli-deploy` paths now permanently redirect to their `-commands` equivalents, and a duplicate navigation entry is removed. - Give the three pages that all rendered as "Overview" distinct titles (Building with AI, self-hosting overview, Management API overview), with sidebar labels unchanged. - Replace the generic "Learn more" links in the introduction's build-extension list with descriptive anchor text. - Switch two http links to https in the Supabase guides, point a troubleshooting page's help link to Discord, and add missing meta descriptions to three help and troubleshooting pages.
## ✅ Checklist - [x] I have followed every step in the [contributing guide](https://github.com/triggerdotdev/trigger.dev/blob/main/CONTRIBUTING.md) - [x] The PR title follows the convention. - [x] I ran and tested the code works --- ## Testing Ran the webapp locally with the change applied; it compiles and serves. The edit only swaps the chart card title string from "LLM spend" to "LLM spend ($)" on the agent landing page. --- ## Changelog The agent dashboard "LLM spend" chart label now includes the currency unit, reading "LLM spend ($)". --- ## Screenshots _[Screenshots]_ 💯
Pushes new organizations and users into the Attio CRM at signup time, for Customer Success (TRI-10431).

- Orgs → Attio `workspaces`, users → Attio `users`, keyed on Attio's built-in unique `workspace_id` / `user_id` so writes are idempotent upserts.
- Runs on the common Redis worker (not inline), so a slow or unavailable Attio never blocks the signup path; failures retry (3 attempts).
- Hooks: user-created (alongside the existing Loops call) and org-created (`createOrganization`).
- Gated behind `ATTIO_API_KEY`: with no key the sync is skipped entirely, so OSS / self-hosted installs are unaffected.

Only creation is covered here (the record "shell"); spend, runs, plan changes, churn, and role/relationship linking are populated by the scheduled full sync, tracked separately.

**Deploy note:** requires an Attio API key set as `ATTIO_API_KEY` in the webapp env, with scopes **Records (read-write)** + **Object Configuration (read)**. The second scope is needed because the assert/upsert endpoint reads object config to resolve the matching attribute. Without the key the sync no-ops.

---------

Co-authored-by: Matt Aitken <matt@mattaitken.com>
Co-authored-by: devin-ai-integration[bot] <158243242+devin-ai-integration[bot]@users.noreply.github.com>
## Summary Adds the `4.5.0-rc.7` entry to the AI chat changelog, covering the agent-facing changes in [v4.5.0-rc.7](https://github.com/triggerdotdev/trigger.dev/releases/tag/v4.5.0-rc.7): - `chat.headStart` now works with the `chat.customAgent` and `chat.createSession` backends, not just `chat.agent` - Opt-in Anthropic system-prompt caching via `chat.toStreamTextOptions()` - Three custom-agent-loop fixes: continuation replay, mid-stream steering, and task-backed tools - `trigger skills` follow-ups: `trigger-` namespacing, SDK-bundled docs, and a new cost-savings skill Generic, non-agent rc.7 items (the CLI uninitialized-project error message, run-span cost fields) are intentionally left out to keep this changelog scoped to AI chat agents.
## Summary The Personal Access Tokens page now shows each token's maximum role in a new column, so you can see at a glance what a token is capped to. The column only appears when an RBAC plugin is installed, and shows "-" for tokens with no cap. Its header tooltip reuses the same explanation shown in the create-token panel.
## Summary Replaces the multi-select popover task type filter on the Tasks page with a single-select segmented control: **All** plus icon-only **Agent**, **Standard**, and **Scheduled** segments. Each segment has a tooltip showing its label and a number-key shortcut (0-3), and the search field no longer autofocuses so the shortcuts work on page load. ## ✅ Checklist - [x] I have followed every step in the [contributing guide](https://github.com/triggerdotdev/trigger.dev/blob/main/CONTRIBUTING.md) - [x] The PR title follows the convention. - [x] I ran and tested the code works
#3997)

## Summary

Adds a short-lived, delegated token (`tr_uat_...`) that authenticates against the API as a user without handing out a long-lived personal access token. You mint one from a PAT, optionally narrow it to a set of scopes, and give it a lifetime; the API then treats requests as that user, subject to their role.

`trigger.dev mint-token` is the entry point (it uses your stored PAT):

```bash
UAT=$(trigger.dev mint-token --ttl 3600 --cap read:runs)
```

The token works anywhere a PAT does for user-level endpoints, and can be exchanged for an environment JWT at `POST /api/v1/projects/:ref/:env/jwt` to reach environment-scoped data (the same exchange a PAT supports).

## How it works

A user-actor token is a short-lived JWT verified by a new first-class `authenticateUserActor` method on the RBAC plugin. Self-hosters get a built-in fallback; role-aware enforcement comes from the plugin.

Effective permissions are the intersection of the user's role and the token's optional scope cap, so a token is only ever narrower than the user, never broader. Minting is restricted to personal access tokens (a token can't mint another one, and an environment key can't mint one). Tokens default to a 1 hour lifetime (max 365 days).

When exchanged for an environment JWT, the user is stamped on it for attribution and the scope cap is carried through.
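The intersection rule can be shown with a short sketch. It assumes permissions are plain scope strings such as `read:runs` and that a missing cap means "no narrowing"; `effectiveScopes` is a hypothetical helper, and `write:runs` and `read:deployments` are made-up scope names. This is not the plugin's `authenticateUserActor` logic.

```typescript
// Effective permissions = role scopes ∩ token cap (if a cap was supplied).
function effectiveScopes(roleScopes: string[], scopeCap?: string[]): string[] {
  if (scopeCap === undefined) return roleScopes.slice(); // no cap: the role as-is
  return roleScopes.filter((scope) => scopeCap.indexOf(scope) !== -1);
}

// Assumed role scopes for the example user.
const roleScopes = ["read:runs", "write:runs", "read:deployments"];

// A cap narrows the token to a subset of the role.
const capped = effectiveScopes(roleScopes, ["read:runs"]);

// A cap naming a scope the role lacks grants nothing extra.
const overreaching = effectiveScopes(["read:runs"], ["read:runs", "write:runs"]);

// No cap: the token carries exactly the user's role.
const uncapped = effectiveScopes(roleScopes);
```

Because the result is always filtered from the role's own scopes, a token can be narrower than the user but never broader, whatever the cap lists.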
## Summary The CLAUDE.md audit job (`.github/workflows/claude-md-audit.yml`) frequently hits its 15-turn cap before it finishes reviewing a PR, so the job fails without posting a verdict. For example, the audit job failed on [this run](https://github.com/triggerdotdev/trigger.dev/actions/runs/27837408945/job/82390460772?pr=3990). This raises `--max-turns` from 15 to 25 to give the review room to complete, and pins `--model claude-opus-4-8` (the job previously inherited the action default model).